Handling Null values (and equivalents) routinely in Python
I've found the following code invaluable for handling `None` values, including whitespace-only strings that should be treated as `None` depending on the situation. I have been using it for quite some time now:

    class _MyUtils:

        def __init__(self):
            pass

        def _mynull(self, myval, myalt, mystrip=True, mynullstrings=["", "None"], mynuminstances=(int, float)):
            # if the value is None, return the alternative immediately.
            if myval is None:
                return myalt
            # if the value is a number, it is not None - so return the original
            elif isinstance(myval, mynuminstances):
                return myval
            # if the mystrip parameter is true, strip the original and test that
            else:
                if mystrip:
                    testval = myval.strip()
                else:
                    testval = myval
                # if mynullstrings are populated, check if the upper case of the
                # original value matches the upper case of any item in the list.
                # return the alternative if so.
                if len(mynullstrings) > 0:
                    i = 0
                    for ns in mynullstrings:
                        if ns.upper() == testval.upper():
                            i = i + 1
                            break
                    if i > 0:
                        return myalt
                    else:
                        return myval
                else:
                    return myval


    def main():
        x = _MyUtils()
        print(x._mynull(None, "alternative_value", True, [""]))


    if __name__ == '__main__':
        main()

The function takes an input value; an alternative to return if the input is found to be null; a flag saying whether to strip the input before testing (if it is not a number); the strings to treat as equivalent to `None`; and the numeric types used to decide whether the input is a number (and hence not `None`).
Essentially, too many processes that we run depend upon not having None values in the data being processed—whether that be lambda functions, custom table toolsets, etc. This code gives me the ability to handle None values predictably, but I am sure there is a better approach here. Is there a more Pythonic way of doing this? How can this code be improved? How would others approach this problem?
python null
edited May 3 at 17:46 by 200_success
asked May 3 at 13:02 by lb_so
migrated from stackoverflow.com May 3 at 16:32
This question came from our site for professional and enthusiast programmers.
This kind of imprecision is a reason why JavaScript can be so brittle. JavaScript is very liberal in converting values to other types. It's better to be strict about types and the values you allow for them. An unexpected value is a bug; it should not be silently corrected. – usr, 2 days ago

The 'fault' lies not with me, but with the dataset I am being provided with for analysis. If that dataset is imperfect, I have only limited choices - I can effectively fix the data at source (using similar code) or create a toolset that works relatively predictably across multiple datasets which was my intention here. Hope that clarifies my issue. – lb_so, yesterday

I see. I thought this util class was supposed to be used in regular code. As a data import helper this is very useful and totally appropriate. Fix the data as it enters the system. – usr, yesterday
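
For what it's worth, "fix the data as it enters the system" might look roughly like the sketch below. Everything in it is assumed for illustration: the sample CSV text, the column names, the `NULL_STRINGS` set and the `normalise` helper are hypothetical and not taken from the question's datasets.

```python
import csv
import io

NULL_STRINGS = {"", "none", "null"}  # assumed null-like spellings


def normalise(value):
    """Map None and null-like strings to None; leave other values unchanged."""
    if value is None:
        return None
    return None if value.strip().casefold() in NULL_STRINGS else value


sample = "name,city\nAda, \nNone,Paris\n"  # made-up data
rows = [
    {key: normalise(value) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(sample))
]
print(rows)
```

Once the rows are normalised at import time, the rest of the code only has to deal with real values and `None`.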
4 Answers
Further to everything else that's been written, I find it's generally better for functions to raise an exception if the wrong data type is passed in. I'd therefore discourage code that special-cases things, like your checks for `int`s and `float`s. I'd write the function as:

    def replace_null(text, *, empty_is_null=True, strip=True, nulls=('NULL', 'None')):
        """Return None if text represents 'none', otherwise text with whitespace stripped."""
        if text is None:
            return None
        if strip:
            text = str.strip(text)
        if empty_is_null and not text:
            return None
        if str.casefold(text) in (s.casefold() for s in nulls):
            return None
        return text

The asterisk (`*`) marks the arguments after it as keyword-only (see PEP 3102), which I think helps future readers of the code. For example, I would probably have to look at the definition to work out what:

    x = myobj._mynull(text, 'default', False)

does, especially the unqualified `False`, when compared to (assuming the above is saved in `utils.py`):

    x = utils.replace_null(text, strip=False) or 'default'

which relies more on keyword arguments and standard Python semantics.

I've also added a small docstring, so that `help(replace_null)` works.
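
To make the behaviour concrete, here is a small runnable sketch. The function is repeated from this answer so the block stands on its own; the sample inputs are made up.

```python
def replace_null(text, *, empty_is_null=True, strip=True, nulls=('NULL', 'None')):
    """Return None if text represents 'none', otherwise text with whitespace stripped."""
    if text is None:
        return None
    if strip:
        text = str.strip(text)
    if empty_is_null and not text:
        return None
    if str.casefold(text) in (s.casefold() for s in nulls):
        return None
    return text


print(replace_null("  null ") or "default")
print(replace_null("  data ") or "default")

# Options must be passed by keyword; a positional flag is rejected.
try:
    replace_null("data", False)
except TypeError as exc:
    print("positional flag rejected:", exc)

# A non-string such as 42 raises instead of being silently passed through.
try:
    replace_null(42)
except TypeError as exc:
    print("wrong type rejected:", exc)
```

The last two calls show the trade-off being argued for: misuse fails loudly at the call site rather than being quietly tolerated.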
Generally I don't think you should have a class for this functionality. There's no state and no particular meaning to a `_MyUtils` object here. You can make this into a standalone function in whatever module you deem appropriate in your codebase.

I think this function as written is a code smell. It 1) doesn't cover a whole lot of types and 2) implies that where you're using it you're not going to have even a rough idea of what type of data you're expecting. In most cases you will have some idea, and even then it's not usually a good idea to do explicit type checking.

Where you're using this for numbers you can replace it with `myval if myval is not None else mydefault`.
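
As a rough illustration of why the conditional expression suits numbers (the helper name `value_or_default` is hypothetical, not from the question): it only replaces `None`, so a legitimate `0` survives, whereas a plain `or` would treat `0` as missing.

```python
def value_or_default(value, default):
    """Return default only when value is None; falsy numbers such as 0 are kept."""
    return value if value is not None else default


print(value_or_default(None, 10))  # None is replaced
print(value_or_default(0, 10))     # 0 is a real value and is kept
print(0 or 10)                     # plain `or` treats 0 as missing
```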
A function like this may be more useful for strings, for which there is a wider range of essentially empty values. Perhaps something like this:

    def safe_string(s, default="", blacklist=["None"]):
        if s is None or len(s.strip()) == 0:
            return default
        if s.upper() in [b.upper() for b in blacklist]:
            return default
        return s
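
A quick usage sketch of `safe_string`, repeated here so the block runs on its own. One change is my own assumption rather than part of the answer: the `blacklist` default is a tuple, which sidesteps Python's shared-mutable-default pitfall. The sample inputs are made up.

```python
def safe_string(s, default="", blacklist=("None",)):
    """Return default for None, blank strings, and blacklisted strings."""
    if s is None or len(s.strip()) == 0:
        return default
    if s.upper() in [b.upper() for b in blacklist]:
        return default
    return s


for raw in (None, "   ", "none", " value "):
    print(repr(raw), "->", repr(safe_string(raw, default="n/a")))
```

Note that, as written, the value is not stripped before the blacklist comparison, so a padded string such as `" None "` passes through unchanged.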
answered May 3 at 13:32 by Quinn Mortimer
agreed on the Class statement, thank you. – lb_so, yesterday
Apart from the "blacklist" feature, you can in many cases just use `or` to supply a "default" value if the first operand is falsy. Some examples:

    >>> "foo" or "default"
    'foo'
    >>> "" or "default"
    'default'
    >>> None or "default"
    'default'

And similarly for numbers, lists, etc.:

    for x in list_that_could_be_none or []:
        print(x * (number_that_could_be_none or 0))

But note that any non-empty string is truthy (though you can still `strip` it first):

    >>> " " or "default"
    ' '
    >>> " ".strip() or "default"
    'default'
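
Combining the two idioms gives a one-line cleaner for values that are either `None` or strings. This is only a minimal sketch; `clean_text` is a hypothetical name, not something from the question.

```python
def clean_text(value, default="default"):
    """Strip a possibly-None string and fall back to default if nothing is left."""
    return (value or "").strip() or default


print(clean_text(None))
print(clean_text("   "))
print(clean_text("  foo "))
```

A truthy non-string such as `5` raises `AttributeError` here, since it has no `strip` method, so this only fits places where the value is known to be a string or `None`.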
answered May 3 at 13:37 by tobias_k
This loop could be rewritten:

    if len(mynullstrings) > 0:
        i = 0
        for ns in mynullstrings:
            if ns.upper() == testval.upper():
                i = i + 1
                break
        if i > 0:
            return myalt
        else:
            return myval
    else:
        return myval

as:

    if testval.upper() in [ns.upper() for ns in mynullstrings]:
        return myalt
    else:
        return myval

I would also rewrite this:

    if mystrip:
        testval = myval.strip()
    else:
        testval = myval

as:

    if mystrip:
        myval = myval.strip()

and continue to use `myval`. This seems clearer to me.

Personally, I don't think prepending 'my' is a good style; variable names should be descriptive in and of themselves.
edited May 3 at 16:33 by Cody Gray
answered May 3 at 13:16 by lolopop
also note that `str.casefold()` is recommended for comparing strings. see for example stackoverflow.com/q/45745661/1358308 and stackoverflow.com/q/40348174/1358308 – Sam Mason, May 3 at 16:46

I really like how you have crushed my multi-line loops into a far more pythonic version here. Thank you. – lb_so, yesterday
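
The membership rewrite and the `str.casefold()` suggestion can be combined. The sketch below is an illustration under my own assumptions, not code from the thread: `is_null_string` is a hypothetical helper, and its default `null_strings` simply mirrors the question's `mynullstrings` default.

```python
def is_null_string(text, null_strings=("", "None")):
    """Return True if text matches any null-like string, ignoring case."""
    folded = text.casefold()
    # any() stops at the first match, like the `break` in the original loop.
    return any(folded == ns.casefold() for ns in null_strings)


print(is_null_string("NONE"))
print(is_null_string("value"))
# casefold() handles cases that lower() misses, e.g. the German sharp s:
print("Straße".lower() == "STRASSE".lower())
print("Straße".casefold() == "STRASSE".casefold())
```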
$begingroup$
Further to everything else that's been written, I find it's generally better for functions to raise an exception if the wrong data type is passed in. I'd therefore discourage code that special-cases things, like your checks for ints and floats. I'd write the function as:

    def replace_null(text, *, empty_is_null=True, strip=True, nulls=('NULL', 'None')):
        """Return None if text represents 'none', otherwise text with whitespace stripped."""
        if text is None:
            return None
        if strip:
            text = str.strip(text)
        if empty_is_null and not text:
            return None
        if str.casefold(text) in (s.casefold() for s in nulls):
            return None
        return text

The asterisk (*) indicates keyword-only arguments (see PEP 3102), which I think helps future readers of the code. For example, I would probably have to look at the definition to determine what:

    x = myobj._mynull(text, 'default', False)

does, especially the unqualified False, when compared to (assuming the above is saved in utils.py):

    x = utils.replace_null(text, strip=False) or 'default'

which relies more on keyword arguments and standard Python semantics.
I've also added a small docstring, so that help(replace_null) works.
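To make the behaviour described above concrete, here is a short self-contained sketch that repeats the function and exercises it. The sample inputs are invented for illustration. It shows how the `or 'default'` idiom supplies a fallback, and how both a wrong data type and a positional flag raise TypeError rather than being silently accepted.

```python
def replace_null(text, *, empty_is_null=True, strip=True, nulls=('NULL', 'None')):
    """Return None if text represents 'none', otherwise text with whitespace stripped."""
    if text is None:
        return None
    if strip:
        text = str.strip(text)  # raises TypeError if text is not a str
    if empty_is_null and not text:
        return None
    if str.casefold(text) in (s.casefold() for s in nulls):
        return None
    return text


print(replace_null('  null '))             # None
print(replace_null(' abc '))               # abc
print(replace_null('None') or 'default')   # default

for bad_call in (lambda: replace_null(42),          # wrong data type
                 lambda: replace_null('x', False)):  # flag passed positionally
    try:
        bad_call()
    except TypeError as exc:
        print('TypeError:', exc)
```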
$endgroup$
answered May 3 at 18:09 by Sam Mason
$begingroup$
This kind of imprecision is a reason why JavaScript can be so brittle. JavaScript is very liberal in converting values to other types. It's better to be strict about types and the values you allow for them. An unexpected value is a bug; it should not be silently corrected.
$endgroup$
– usr
2 days ago
$begingroup$
The 'fault' lies not with me, but with the dataset I am being provided with for analysis. If that dataset is imperfect, I have only limited choices - I can effectively fix the data at source (using similar code) or create a toolset that works relatively predictably across multiple datasets which was my intention here. Hope that clarifies my issue.
$endgroup$
– lb_so
yesterday
1
$begingroup$
I see. I thought this util class was supposed to be used in regular code. As a data import helper this is very useful and totally appropriate. Fix the data as it enters the system.
$endgroup$
– usr
yesterday