How to md5 a list of filepaths contained in a file?
I have a folder containing many folders containing many files. Thousands.
I can do find . -type f > ./FILE-LISTING.TXT
to create a file containing many thousands of file paths that looks like this:
./Anders/Letters/20190101 Rent.pdf
./Anders/Letters/20190103 Appeal.pdf
./Anders/Letters/20190107 Decision.pdf
./Beeker/Letters/20180101 Rent.pdf
How would I feed that list of filepaths into md5
to produce an output formatted like this:
9cf14e4d666dcb6aab17763b02429a19 ./Anders/Letters/20190101 Rent.pdf
d1bb70baa31f1df69628c00632b65eab ./Anders/Letters/20190103 Appeal.pdf
7a0f5bc18688fe8ba32f43aa6ec53fb1 ./Anders/Letters/20190107 Decision.pdf
a0c96a79cf3b1847025d9f073151519d ./Beeker/Letters/20180101 Rent.pdf
NB: I want the md5 hashes of the referenced files, not the md5 of the list of files, nor the md5 hashes of the strings in the file-listing.txt.
Also, would it be faster to do it all in one command line, or do it in two passes (find to create file-listing.txt, then md5 to create file-listing-md5.txt)?
macos command-line script automation
3
This is a superb question - clear, poses a few challenges, but is going to be very doable since every tool for automation on MacOS needs to handle spaces in file names, loops and variables to handle the changing file being processed. Well done - I hope we get some great answers in python, bash, swift and other options for scripting.
– bmike♦
Jul 10 at 11:23
3
What is your use case for this file? mtree is a tool already available to monitor file hashes and detect changes to filenames, file contents, permissions or datestamps. man mtree for details. mtree -c -K md5digest
– Jim L.
Jul 11 at 6:14
The use case is to hand a file of hashes and filepaths to a third party RDBMS which tracks a lot of extra detail not present in the file system. If the files get moved around, they can be re-linked. If the file is edited in place it can be re-linked.
– Erics
Jul 11 at 11:04
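To illustrate the re-linking use case described in the comment above, here is a minimal sketch that compares two hash listings and reports files whose content is unchanged but whose path differs. The helper name find_moved and the two listing file names are hypothetical, and both inputs are assumed to be in the "hash path" format shown in the question. It is not how the third-party RDBMS works, only an illustration of matching by hash.

```shell
#!/bin/sh
# find_moved is a hypothetical helper, not an existing tool.
# Assumption: both files hold one "hash path" pair per line (md5 -r format).
find_moved() {
    old=$1
    new=$2
    awk '
        # The path is everything after the first space, so spaces are kept.
        function path_of(line) { return substr(line, index(line, " ") + 1) }
        # First file: remember the path recorded for each hash.
        NR == FNR { old_path[$1] = path_of($0); next }
        # Second file: same hash under a different path means the file moved.
        ($1 in old_path) && old_path[$1] != path_of($0) {
            print old_path[$1] " -> " path_of($0)
        }
    ' "$old" "$new"
}
```

Usage would look like find_moved old-listing-md5.txt new-listing-md5.txt. If several files share the same content (and so the same hash), only the last path seen in the old listing is remembered, so this sketch is not reliable for duplicates.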
Question asked Jul 10 at 9:37 by Erics, edited Jul 10 at 11:21 by bmike♦
2 Answers
find . -type f -exec /sbin/md5 -r {} +
       ^^^^^^^ ^^^^^ ^^^^^^^^^^^^ ^^ ^
          |      |        |       |  |
          |      |        |       |  +- add as many file names as possible per call
          |      |        |       +---- replace with names of found files
          |      |        +------------ command to run
          |      +--------------------- execute following command
          +---------------------------- find any "normal" file
should do the trick (and takes care of the usual issues with spaces etc. within filenames).
As for faster: one pass is almost always faster than two passes. In this specific case, the MD5 calculation takes so much time that other factors can most probably be ignored.
PS: Tip of the hat to @lhf for reminding me of -r
4
Both @nohillside and @lhf provided good, valid answers. On a whim, I decided to see if one is substantially better than the other. I ran both on a directory containing more than 64,000 files under time. The find -exec version was about 3 seconds faster than find | xargs. However, the run time for both was around 45 seconds, meaning that (a) the difference is less than 10% and (b) the time is probably I/O bound (printing to the console).
– Craig S. Cottingham
Jul 10 at 19:09
This is almost certainly I/O bound (but not for printing to the console, it has to digest all those files, that's gonna take time)
– Thilo
Jul 11 at 9:51
@CraigS.Cottingham There are about that many files, but in deeply nested directories - not just one directory - which might explain why the command line I inherited takes about 15 minutes to run. Next time I'm on site I'll do a comparison too.
– Erics
Jul 11 at 11:08
@Erics Simple find commands (as the one you have in the question) are purely I/O bound. When calculating MD5 hashes as well, it could be both I/O (for reading all data) or CPU (for calculating the hash), but this then depends on the hardware used.
– nohillside♦
Jul 11 at 11:15
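One way to check whether console printing matters in a comparison like the one in these comments is to time both variants with their output discarded. The sketch below assumes bash (where time is a keyword that covers a whole pipeline); the function name time_variants and the md5sum fallback are assumptions for illustration, not part of either answer.

```shell
#!/bin/bash
# Assumption: fall back to md5sum where the macOS md5 tool is not installed.
if command -v md5 >/dev/null 2>&1; then
    hash_cmd=(md5 -r)
else
    hash_cmd=(md5sum)
fi

# time_variants is a hypothetical helper: time both approaches on a directory
# with the hashes sent to /dev/null, so console output is not measured.
time_variants() {
    local dir=${1:-.}
    echo "find -exec:" >&2
    time find "$dir" -type f -exec "${hash_cmd[@]}" {} + > /dev/null
    echo "find | xargs:" >&2
    time find "$dir" -type f -print0 | xargs -0 "${hash_cmd[@]}" > /dev/null
}
```

Whichever variant runs second may benefit from file data the operating system has already cached, so it is worth running each one more than once before comparing the numbers.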
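Try this:
find . -type f -print0 | xargs -0 md5 -r
Note -print0 and -0 to handle spaces in filenames.
Compared to find . -type f -exec with the \; terminator, which starts md5 once per file, this solution runs md5 much less frequently (the + terminator of -exec batches file names in a similar way), although this might not have a measurable impact.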
2
find's -exec can also handle spaces in filenames.
– fd0
Jul 10 at 11:30
2
Both @nohillside and @lhf provided good, valid answers. On a whim, I decided to see if one is substantially better than the other. I ran both on a directory containing more than 64,000 files under time. The find -exec version was about 3 seconds faster than find | xargs. However, the run time for both was around 45 seconds, meaning that (a) the difference is less than 10% and (b) the time is probably I/O bound (printing to the console).
– Craig S. Cottingham
Jul 10 at 19:09
Thanks for testing it out!
– lhf
Jul 10 at 20:10
What shell are you using?
– fd0
Jul 10 at 20:13
@fd0, I'm using bash.
– lhf
Jul 10 at 21:41
Answer using find -exec: answered Jul 10 at 9:52 by nohillside♦, edited Jul 11 at 6:22
Answer using find | xargs: answered Jul 10 at 10:51 by lhf, edited Jul 10 at 15:15