How to find directories containing only specific files
I have several directories with useless files (like `*.tmp`, `desktop.ini`, `Thumbs.db`, `.picasa.ini`).
How can I scan all drives to find directories that contain nothing but some of those files?
Tags: find
Do you actually need to find the directories, or are you just going to delete those useless files and the directories? In the second case, you could just first delete the useless files, and then delete all now-empty directories. (Which would of course also remove any directories that were empty to begin with.) – ilkkachu, Aug 13 at 15:01
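The two-step cleanup suggested in this comment could be sketched roughly as follows. This is a minimal sketch, not a tested cleanup tool: it assumes a `find` that supports `-iname`, `-empty`, `-mindepth` and `-delete` (GNU and BSD `find` do; they are not POSIX), and `purge_useless` and the sandbox layout are hypothetical names made up for the illustration. Note that it also deletes useless files from directories that hold other files, and removes directories that were already empty.

```shell
#!/usr/bin/env bash
# Sketch: delete the useless files first, then remove directories left empty.
# Assumes GNU or BSD find (-iname, -empty, -mindepth, -delete are not POSIX).
# purge_useless is a hypothetical helper name.
purge_useless() {
  local root=$1
  # Step 1: delete the known-useless files, matching names case-insensitively.
  find "$root" -type f \( -iname '*.tmp' -o -iname 'desktop.ini' \
       -o -iname 'Thumbs.db' -o -iname '.picasa.ini' \) -delete
  # Step 2: remove empty directories. -delete implies -depth, so children
  # are handled before their parents and nested empty trees collapse.
  find "$root" -mindepth 1 -type d -empty -delete
}

# Demo in a throwaway sandbox rather than on real data.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/junk" "$sandbox/keep"
touch "$sandbox/junk/Thumbs.db" "$sandbox/junk/a.tmp" \
      "$sandbox/keep/photo.jpg" "$sandbox/keep/desktop.ini"
purge_useless "$sandbox"
find "$sandbox" -mindepth 1 | sort   # lists what survived the cleanup
```

Running it against a copy or a sandbox first, as the demo does, is a cheap way to check the name list before pointing it at real drives.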
asked Aug 13 at 13:10 by Rami Sedhom; edited Aug 13 at 13:54 by Christopher
6 Answers
To find all directories that contain no other name than `*.tmp`, `desktop.ini`, `Thumbs.db`, and/or `.picasa.ini`:
    find . -type d -exec bash -O dotglob -c '
        for dirpath do
            ok=true
            seen_files=false
            set -- "$dirpath"/*
            for name do
                [ -d "$name" ] && continue  # skip dirs
                seen_files=true
                case "${name##*/}" in
                    *.tmp|desktop.ini|Thumbs.db|.picasa.ini) ;; # do nothing
                    *) ok=false; break
                esac
            done
            "$seen_files" && "$ok" && printf "%s\n" "$dirpath"
        done' bash {} +
This uses `find` to locate all directories beneath the current directory (including the current directory itself) and passes them to a shell script.
The shell script iterates over the given directory paths, and for each one it expands `*` in it (with the `dotglob` shell option set in `bash` to catch hidden names).
It then goes through the list of resulting names and matches them against the particular patterns and names that we'd like to find (ignoring directories). If it finds any other name that doesn't match our list, it sets `ok` to `false` (from having been `true`) and breaks out of that inner loop.
The `seen_files` variable becomes `true` as soon as we've seen a file of any type other than directory (or symlink to directory). This variable helps us avoid reporting subdirectories that only contain other subdirectories.
It then runs `$seen_files` and `$ok` (each `true` or `false`), and if both are `true`, which means that the directory contains at least one non-directory entry and only contains filenames in our list, it prints the pathname of the directory.
Instead of
    set -- "$dirpath"/*
    for name do
you could obviously do
    for name in "$dirpath"/*; do
Testing:
    $ tree
    .
    `-- dir
        |-- Thumbs.db
        `-- dir
            |-- file.tmp
            `-- something
    2 directories, 3 files
(the `find` command is run here, producing the output...)
    ./dir
This means that the directory `./dir` only contains names in the list (ignoring directories), while `./dir/dir` contains other things as well.
If you remove `[ -d "$name" ] && continue` from the code, the `./dir` directory would not be found, since it contains a name (`dir`) that is not in our list.
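The per-directory test described above can also be pulled out into a function, which makes it easier to try on a single directory. The following is only a sketch under some assumptions that are not part of the answer: `only_useless` is a hypothetical name, `nullglob` is added so that an empty directory yields no names at all, and `nocasematch` is added so that `THUMBS.DB` or `Desktop.ini` also count as useless.

```shell
#!/usr/bin/env bash
# Sketch: the same check as a function. The body runs in a subshell
# ( ... ) so the shopt settings do not leak into the calling shell.
# only_useless, nullglob and nocasematch are assumptions, see the text above.
only_useless() (
  shopt -s dotglob nullglob nocasematch
  seen_files=false
  for name in "$1"/*; do
    [ -d "$name" ] && continue            # skip subdirectories
    seen_files=true
    case ${name##*/} in
      *.tmp|desktop.ini|Thumbs.db|.picasa.ini) ;;  # name is on the list
      *) exit 1 ;;                                  # any other name disqualifies
    esac
  done
  "$seen_files"   # succeed only if at least one non-directory entry was seen
)

# Demo: rebuild the test tree from the answer in a throwaway sandbox.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/dir/dir"
touch "$sandbox/dir/THUMBS.DB" "$sandbox/dir/dir/file.tmp" "$sandbox/dir/dir/something"
find "$sandbox" -type d -print0 |
  while IFS= read -r -d '' d; do
    only_useless "$d" && printf '%s\n' "$d"
  done
```

Only the outer `dir` should be reported here: the sandbox root holds nothing but a subdirectory, and the inner `dir` contains `something`.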
That fails to return empty directories (you'd probably want `nullglob` as well). – Stéphane Chazelas, Aug 13 at 15:01
That returns directories that have only subdirs, which I doubt is what the OP wants. – Stéphane Chazelas, Aug 13 at 15:02
@StéphaneChazelas Empty directories do not have any files matching any of those names, so that's ok. The second issue will be fixed later in the evening (I'm busy). – Kusalananda♦, Aug 13 at 15:15
But they contain nothing but some of those files. OK, the requirements are a bit ambiguous, and I suppose your interpretation makes more sense than mine. I've updated my answer with both options. – Stéphane Chazelas, Aug 13 at 15:18
@StéphaneChazelas Worked around the subdirs-only issue. – Kusalananda♦, Aug 13 at 20:20
You would need to specify the files or folders, ideally by name, like:
    find "$HOME" -type f -iname thumbs.db -print0 | xargs -0 --no-run-if-empty rm
This will find all files (`-type f`) called (`-iname`) "thumbs.db" (ignoring case because of the `i` in `-iname`) and then remove (`rm`) them.
You may use filename patterns, e.g.
    find "$HOME" -type f -iname '*.tmp' -print0 | xargs -0 --no-run-if-empty rm
Warning: Please be careful what you type, deleting may happen without asking you.
Do make regular backups - right before getting to work on your cleanup may be a good moment!
If you wish to find out what would happen, look at the file list first before `rm`ing anything, like:
    find "$HOME" -type f -iname thumbs.db -print0 | xargs -0 --no-run-if-empty ls -l
It just lists all paths/thumbs.db, but those paths may contain other files. I just need to list directories that contain only this file and nothing else. – Rami Sedhom, Aug 13 at 13:37
So for every directory out of find, it may count files & sub-directories; if it's 1, then print, otherwise ignore. – Rami Sedhom, Aug 13 at 13:38
Your code will find the pathnames to those files (and delete them, even though this was not part of the question), but it does not actually find the directories that contain only these files. – Kusalananda♦, Aug 13 at 14:18
@Kusalananda I did not realise it was that important. I would personally go in afterwards and `find ~ -type d -print0 | xargs -0 --no-run rmdir -p` but only (of course) if there are no other empty folders. – Ned64, Aug 13 at 14:22
@Ned64 thanks for your answer, it inspired me to find my answer :) – Rami Sedhom, Aug 13 at 14:32
I used this combination of the find, xargs, ls, sed, wc and awk commands, and it is working:
    find . -type f \( -iname "desktop.ini" -o -name "thumb.db" \) -printf %h\\0 | xargs -0 -I "{}" sh -c 'printf "{}\t"; ls -l "{}" | sed -n "1!p" | wc -l' | awk '$2 == "1" {print $0}'
Explanation:
`find .` : find in current directory
`-type f` : find files only
`\( -iname "desktop.ini" -o -name "thumb.db" \)` : where filename is "desktop.ini" or "thumb.db" case insensitive
`-printf %h\\0` : print leading directory of file's name + ASCII NUL
`xargs -0 -I "{}" sh -c 'printf "{}\t"; ls -l "{}"` : print output directory and execute `ls -l` on each one
`sed -n "1!p" | wc -l'` : exclude the first line of `ls -l`, which contains the total, and then count lines
`awk '$2 == "1" {print $0}'` : print the line only if the count is equal to "1"
Never embed the `{}` in the shell code, that's very dangerous and makes it an arbitrary code injection vulnerability and is not portable (think of a directory called `$(reboot)` for instance). – Stéphane Chazelas, Aug 13 at 15:39
The first argument of `printf` is the format, you shouldn't use variable data in there. – Stéphane Chazelas, Aug 13 at 15:39
You use `-printf %h\0` and `xargs -0` (GNU extensions btw), but then treat file names as if they were lines or words. – Stéphane Chazelas, Aug 13 at 15:40
Effectively, it seems the intention of that code is to report directories that contain only one entry and that one entry being either desktop.ini or thumbs.db, which is different from your requirements in your question. – Stéphane Chazelas, Aug 13 at 15:42
Note that `sed -n '1!p'` can be written `tail -n +2`. – Stéphane Chazelas, Aug 13 at 15:44
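To illustrate the points raised in these comments, here is one way the "single entry" check could be written without substituting `{}` into the script text: the directory names are handed to the inline script as positional parameters, and `printf` gets a fixed format string. It is a sketch only, following the interpretation in the comment above (exactly one entry, named desktop.ini or thumbs.db in any case). `single_entry_dirs` and the sandbox are hypothetical, and it relies on bash's `dotglob`, `nullglob` and `nocasematch` options.

```shell
#!/usr/bin/env bash
# Sketch: report directories whose only entry is desktop.ini or thumbs.db.
# Names reach the inline script as arguments ("$@"), never as script text,
# so a name such as $(touch INJECTED) is treated as data, not executed.
# single_entry_dirs is a hypothetical helper name.
single_entry_dirs() {
  find "$1" -type d -exec bash -O dotglob -O nullglob -O nocasematch -c '
    for dir do
      entries=("$dir"/*)                       # all entries, hidden ones included
      [ "${#entries[@]}" -eq 1 ] || continue   # want exactly one entry
      case ${entries[0]##*/} in
        desktop.ini|thumbs.db) printf "%s\n" "$dir" ;;
      esac
    done' bash {} +
}

# Demo: a directory with a hostile-looking name, in a throwaway sandbox.
sandbox=$(mktemp -d)
evil='$(touch INJECTED)'
mkdir -p "$sandbox/$evil" "$sandbox/plain"
touch "$sandbox/$evil/Desktop.ini" "$sandbox/plain/a" "$sandbox/plain/b"
single_entry_dirs "$sandbox"
```

The output is still newline-delimited, so it is meant for reading; a name containing a line-feed would need `printf "%s\0"` and a NUL-aware consumer instead.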
With GNU `find` and GNU `awk`, you could have `find` report all the files and `awk` do the matching:
    find . -depth -type d -printf '%p/\0' -o -printf '%p\0' |
      gawk -F/ -v OFS=/ -v RS='\0' -v IGNORECASE=1 '
        /\/$/ {
          # directory record (trailing slash); with -depth it comes after its contents
          NF--
          if (good[$0] == 0 && bad[$0] > 0) print
          next
        }
        {
          name = $NF
          NF--
          if (name ~ /^(.*\.tmp|desktop\.ini|Thumbs\.db|\.picasa\.ini)$/) bad[$0]++
          else good[$0]++
        }'
If you also want to include the empty directories, remove the `&& bad[$0] > 0`. If you want case-sensitive matching, remove `-v IGNORECASE=1`.
With `zsh`, you can do
    set -o extendedglob # for ^ and (#i)
    printf '%s\n' **/*(D/F^e'[()(($#)) $REPLY/^(#i)(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)(ND)]')
to list the directories that contain only entries matching `(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)` case insensitively.
- `**/`: recursive glob (any level of subdirectories)
- `*(qualifier)`: glob (here `*` matching any file), with qualifiers (to match on other criteria than name).
  - `D`: enable `dotglob` for that glob (include hidden files and look inside hidden dirs).
  - `/`: only select files of type directory
  - `F`: only the `F`ull ones (that contain at least one entry). Remove if you also want to list empty directories.
  - `^`: negate the following qualifiers
  - `e'[code]'`: an `e`valuation qualifier: select the files for which the code does not (with the previous `^`) return true.
    - `() {code} args`: anonymous function. Here the code is `(($#))`, which is a ksh-style arithmetic expression which here evaluates to `true` if `$#` is non-zero (`$#` being the number of arguments to the anonymous function).
    - `$REPLY/^(#i)(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)(ND)` makes up the arguments to that inline function. Here that's another glob:
      - `$REPLY`: inside the `e'[code]'` that's the path to the file currently being considered.
      - `^`: negation.
      - `(#i)`: turn on case insensitive matching for the rest of the pattern.
      - `(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)`: either of those, so with negation, none of those.
      - `(ND)`: another glob qualifier. `N` for `nullglob` (the glob expands to nothing if there's no match, so `(($#))` becomes false), `D` for `dotglob` again. Here, as an optimisation, we could also add `oN` (to `N`ot `o`rder the list of matching files) and `[1]` to only select the first, as we don't need to know how many there are, only whether there are some at all.
To make it a bit more legible, we could use a function:
    set -o extendedglob
    has_useful_entries() {
      ()(($#)) ${1-$REPLY}/^(#i)(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)(NDoN[1])
    }
    printf '%s\n' **/*(D/F^+has_useful_entries)
Would you add a little explanation? – Rami Sedhom, Aug 13 at 14:43
@RamiSedhom, see edit. – Stéphane Chazelas, Aug 13 at 14:57
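    find ~/ -type f -print0 2>/dev/null |
      awk -F/ 'BEGIN { RS = ORS = "\0" }   # NUL separators need an awk that supports them, e.g. GNU awk
        { dir = $0; sub("/[^/]*$", "", dir); seen[dir]++ }   # directory part of the path
        $NF ~ /^(desktop\.ini|Thumbs\.db|\.picasa\.ini|.*\.tmp)$/ { found[dir]++ }
        END { for (d in seen) if (seen[d] == found[d]) print d }'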
This uses `find` to just output a NUL-terminated list of files (and only files, `-type f`) in or beneath the target directory (`~/`) and pipes them into an awk script. The `2>/dev/null` is to get rid of warning messages from `find` if/when the user does not have permission to descend into some sub-directories.
The awk script uses `/` as the field separator and sets both the input (RS) and output (ORS) record separators to NUL. It extracts the directory portion of the filename from the input record and keeps count of how many times that directory has been seen (using associative array `seen`). Then, if the final field ($NF) matches one of the desired filename patterns, it keeps count of the matches (using associative array `found`).
Once all the input has been processed, it prints out every directory where the number of times the directory has been seen is equal to the number of found matches for that directory, i.e. it prints only the directories containing ONLY matching files.
Because the ORS is a NUL, the output of this can be safely used as input to `xargs -0r rm -rf` or a similar command, without risk of problems due to spaces, linefeeds or other problematic shell meta-characters in the directory names.
The output can be further processed by any tool or scripting language that can work with NUL-separated input, including `perl` and the GNU versions of `sed`, `sort`, `grep`, `head`, `tail`, and many more. In many cases, you're probably better off either tweaking the `find` options or doing extra processing in the `awk` script (or just rewriting the whole thing in `perl` using the `File::Find` module).
BTW, if you haven't yet finalised what kind of post-processing (if any) you want to do on the directory list, redirecting the output of the `find ... | awk ...` to a file is useful because the `find` operation is very demanding on disk I/O - using a file as input to further processing avoids multiple runs just to provide the same input (i.e. it's a cache).
Finally, if you want to visually examine the output (e.g. to make sure you aren't going to delete anything important), change the `RS = ORS = "\0"` assignment to `RS = "\0"`, so you get a line-feed between each directory name. That output can't be safely used as input to `xargs` because line-feeds are valid characters in unix file/directory names.
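As a middle ground between the NUL-delimited output and the line-feed variant, the NUL-delimited list can be read in bash and printed in a shell-quoted form for review, which stays unambiguous even for names containing line-feeds. This is a small sketch rather than part of the answer: `review_dirs` is a hypothetical helper, it relies on bash's `read -d ''` and `printf %q`, and the `printf '%s\0'` line merely stands in for the `find ... | awk ...` pipeline.

```shell
#!/usr/bin/env bash
# Sketch: read NUL-delimited names from stdin and print one shell-quoted
# name per line. review_dirs is a hypothetical helper name.
review_dirs() {
  local dir
  while IFS= read -r -d '' dir; do
    printf '%q\n' "$dir"   # %q escapes spaces, line-feeds and other specials
  done
}

# Stand-in for the pipeline: two NUL-terminated names,
# the second of which contains a line-feed.
printf '%s\0' 'plain dir' $'odd\nname' | review_dirs
```

Each input name produces exactly one output line, so the listing can be checked by eye (or counted) before the original NUL-delimited list is passed on to `xargs -0`.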
(The first answer above, the one using `find` with an inline `bash` script, was answered Aug 13 at 14:37 by Kusalananda♦ and edited Aug 14 at 7:58 by Stéphane Chazelas.)
That fails to return empty directories (you'd probably wantnullglob
as well).
– Stéphane Chazelas
Aug 13 at 15:01
That returns directories that have only subdirs which I doubt is what the OP wants.
– Stéphane Chazelas
Aug 13 at 15:02
@StéphaneChazelas Empty directories do not have any files matching any of those names, so that's ok. The second issue will be fixed later in the evening (I'm busy).
– Kusalananda♦
Aug 13 at 15:15
2
But they contain nothing but some of those files. OK, the requirements are a bit ambiguous, and I suppose your interpretation makes more sense than mine. I've updated my answer with both options.
– Stéphane Chazelas
Aug 13 at 15:18
@StéphaneChazelas Worked around the subdirs-only issue.
– Kusalananda♦
Aug 13 at 20:20
add a comment |
That fails to return empty directories (you'd probably wantnullglob
as well).
– Stéphane Chazelas
Aug 13 at 15:01
That returns directories that have only subdirs which I doubt is what the OP wants.
– Stéphane Chazelas
Aug 13 at 15:02
@StéphaneChazelas Empty directories do not have any files matching any of those names, so that's ok. The second issue will be fixed later in the evening (I'm busy).
– Kusalananda♦
Aug 13 at 15:15
2
But they contain nothing but some of those files. OK, the requirements are a bit ambiguous, and I suppose your interpretation makes more sense than mine. I've updated my answer with both options.
– Stéphane Chazelas
Aug 13 at 15:18
@StéphaneChazelas Worked around the subdirs-only issue.
– Kusalananda♦
Aug 13 at 20:20
That fails to return empty directories (you'd probably want
nullglob
as well).– Stéphane Chazelas
Aug 13 at 15:01
That fails to return empty directories (you'd probably want
nullglob
as well).– Stéphane Chazelas
Aug 13 at 15:01
That returns directories that have only subdirs which I doubt is what the OP wants.
– Stéphane Chazelas
Aug 13 at 15:02
That returns directories that have only subdirs which I doubt is what the OP wants.
– Stéphane Chazelas
Aug 13 at 15:02
@StéphaneChazelas Empty directories do not have any files matching any of those names, so that's ok. The second issue will be fixed later in the evening (I'm busy).
– Kusalananda♦
Aug 13 at 15:15
@StéphaneChazelas Empty directories do not have any files matching any of those names, so that's ok. The second issue will be fixed later in the evening (I'm busy).
– Kusalananda♦
Aug 13 at 15:15
2
2
But they contain nothing but some of those files. OK, the requirements are a bit ambiguous, and I suppose your interpretation makes more sense than mine. I've updated my answer with both options.
– Stéphane Chazelas
Aug 13 at 15:18
But they contain nothing but some of those files. OK, the requirements are a bit ambiguous, and I suppose your interpretation makes more sense than mine. I've updated my answer with both options.
– Stéphane Chazelas
Aug 13 at 15:18
@StéphaneChazelas Worked around the subdirs-only issue.
– Kusalananda♦
Aug 13 at 20:20
@StéphaneChazelas Worked around the subdirs-only issue.
– Kusalananda♦
Aug 13 at 20:20
add a comment |
You would need to specify the files or folders, ideally by name, like:
find $HOME -type f -iname thumbs.db -print0 | xargs -0 --no-run-if-empty rm
will find all files (-type f
) called (-iname
) "thumbs.db" (ignoring case because of the i
in iname
) and then removing (rm
) them.
You may use filename patterns, e.g.
find $HOME -type f -iname '*.tmp' -print0 | xargs -0 --no-run-if-empty rm
Warning: Please be careful what you type, deleting may happen without asking you.
Do make regular backups - right before getting to work on your cleanup may be a good moment!
If you wish to find out what would happen look at the file list first before rm
ing anything, like:
find $HOME -type f -iname thumbs.db -print0 | xargs -0 --no-run-if-empty ls -l
It just lists all paths/thumb.db, but those paths may contain other files. I just need to list directories that contain only this file and nothing else.
– Rami Sedhom
Aug 13 at 13:37
So for every directory output by find, it may count files & sub-directories; if the count is 1, then print, otherwise ignore.
– Rami Sedhom
Aug 13 at 13:38
Your code will find the pathnames to those files (and delete them, even though this was not part of the question), but it does not actually find the directories that contain only these files.
– Kusalananda♦
Aug 13 at 14:18
@Kusalananda I did not realise it was that important. I would personally go in afterwards and
find ~ -type d -print0 | xargs -0 --no-run rmdir -p
but only (of course) if there are no other empty folders.
– Ned64
Aug 13 at 14:22
@Ned64 thanks for your answer, it inspired me to find my answer :)
– Rami Sedhom
Aug 13 at 14:32
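The per-directory check discussed in these comments could be sketched in POSIX sh roughly as follows. This is only an illustration, not any poster's code: the function name only_junk_dirs and the hard-coded name list are assumptions, and case-insensitivity is approximated with bracket patterns. It reports a directory when it has at least one entry and every entry (files and sub-directories alike) matches the list.

```shell
#!/bin/sh
# Sketch: print directories under "$1" whose entries ALL match a junk-name list.
# only_junk_dirs and the name list below are illustrative assumptions.
only_junk_dirs() {
  find "${1:-.}" -type d -exec sh -c '
    for dir do
      junk=0 other=0
      # The three globs together cover visible and hidden entries.
      for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # Skip a glob that matched nothing and was left as a literal string.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        case ${entry##*/} in
          [Tt]humbs.db|[Dd]esktop.ini|*.tmp) junk=$((junk + 1)) ;;
          *) other=$((other + 1)) ;;
        esac
      done
      if [ "$junk" -gt 0 ] && [ "$other" -eq 0 ]; then
        printf "%s\n" "$dir"
      fi
    done
  ' sh {} +
}
```

Calling only_junk_dirs ~ would list candidates one per line. Because the output is newline-delimited, it is meant for reading rather than for piping into a delete command.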
edited Aug 13 at 13:32
answered Aug 13 at 13:14
Ned64
Used this combination of find, xargs, ls, sed, wc and awk commands and it is working:
find . -type f \( -iname "desktop.ini" -o -name "thumb.db" \) -printf %h\\0 | xargs -0 -I "{}" sh -c 'printf "{}\t"; ls -l "{}" | sed -n "1!p" | wc -l' | awk '$2 == "1" {print $0}'
Explanation:
find . : find in current directory
-type f : find files only
\( -iname "desktop.ini" -o -name "thumb.db" \) : where filename is "desktop.ini" or "thumb.db" case insensitive
-printf %h\\0 : print leading directory of file's name + ASCII NUL
xargs -0 -I "{}" sh -c 'printf "{}\t"; ls -l "{}" : print output directory and execute ls -l on each one
sed -n "1!p" | wc -l' : exclude the first line of ls -l, which contains the total of files and directories, and then count lines
awk '$2 == "1" {print $0}' : print the line only if the count is equal to "1"
Never embed the {} in the shell code, that's very dangerous and makes it an arbitrary code injection vulnerability and is not portable (think of a directory called $(reboot) for instance).
– Stéphane Chazelas
Aug 13 at 15:39
The first argument of printf is the format, you shouldn't use variable data in there.
– Stéphane Chazelas
Aug 13 at 15:39
You use -printf %h\\0 and xargs -0 (GNU extensions btw), but they treat file names as if they were lines or words.
– Stéphane Chazelas
Aug 13 at 15:40
Effectively, it seems the intention of that code is to report directories that contain only one entry and that one entry being either desktop.ini or thumbs.db which is different from your requirements in your question.
– Stéphane Chazelas
Aug 13 at 15:42
Note that sed -n '1!p' can be written tail -n +2.
– Stéphane Chazelas
Aug 13 at 15:44
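To illustrate the first two points raised in these comments, here is a minimal sketch that hands the file names to sh as positional parameters instead of embedding {} in the script text, and keeps the printf format string constant. The function name parent_dirs_of_matches is hypothetical, and -iname is assumed to be available (it is supported by GNU and BSD find but is not in POSIX).

```shell
#!/bin/sh
# Sketch: print the parent directory of each matching file.
# parent_dirs_of_matches is a hypothetical name; -iname is a non-POSIX extension.
parent_dirs_of_matches() {
  find "${1:-.}" -type f \( -iname 'desktop.ini' -o -iname 'thumbs.db' \) \
    -exec sh -c '
      # File names arrive as arguments, so they are data and never parsed as code.
      for file do
        dir=${file%/*}
        # Constant format string; the directory name is only an argument.
        printf "%s\n" "$dir"
      done
    ' sh {} +
}
```

With this shape, a directory literally named $(reboot) is printed as text and nothing inside its name is executed.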
edited Aug 13 at 14:26
answered Aug 13 at 13:57
Rami Sedhom
With GNU find and GNU awk, you could have find report all the files and awk do the matching:
find . -depth -type d -printf '%p/\0' -o -printf '%p\0' |
  gawk -F/ -v OFS=/ -v RS='\0' -v IGNORECASE=1 '
    /\/$/ {
      NF--
      if (good[$0] == 0 && bad[$0] > 0) print
      next
    }
    {
      name = $NF
      NF--
      if (name ~ /^(.*\.tmp|desktop\.ini|thumbs\.db|\.picasa\.ini)$/)
        bad[$0]++
      else
        good[$0]++
    }'
If you also want to include the empty directories, remove the && bad[$0] > 0. If you want case sensitive matching, remove -v IGNORECASE=1.
edited Aug 14 at 7:11
answered Aug 13 at 19:19
Stéphane Chazelas
With zsh, you can do
set -o extendedglob # for ^ and (#i)
printf '%s\n' **/*(D/F^e'[()(($#)) $REPLY/^(#i)(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)(ND)]')
to list the directories that contain only entries matching (*.tmp|desktop.ini|Thumbs.db|.picasa.ini) case insensitively.
**/ : recursive glob (any level of subdirectories)
*(qualifier) : glob (here * matching any file), with qualifiers (to match on other criteria than name).
D : enable dotglob for that glob (include hidden files and look inside hidden dirs).
/ : only select files of type directory
F : only the Full ones (that contain at least one entry). Remove if you also want to list empty directories.
^ : negate the following qualifiers
e'[code]' : an evaluation qualifier: select the files for which the code does not (with the previous ^) return true.
() {code} args : anonymous function. Here the code is (($#)) which is a ksh-style arithmetic expression which here evaluates to true if $# is non-zero ($# being the number of arguments to the anonymous function).
$REPLY/^(#i)(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)(ND) makes up the arguments to that inline function. Here that's another glob:
  $REPLY : inside the e'[code]' that's the path to the file currently being considered.
  ^ : negation.
  (#i) : turn on case insensitive matching for the rest of the pattern.
  (*.tmp|desktop.ini|Thumbs.db|.picasa.ini) : either of those, so with negation, none of those.
  (ND) : another glob qualifier. N for nullglob (the glob expands to nothing if there's no match, so (($#)) becomes false), D for dotglob again. Here, as an optimisation, we could also add oN (to Not order the list of matching files) and [1] to only select the first as we don't need to know how many there are, only whether there are some at all.
To make it a bit more legible, we could use a function:
set -o extendedglob
has_useful_entries() {
  ()(($#)) ${1-$REPLY}/^(#i)(*.tmp|desktop.ini|Thumbs.db|.picasa.ini)(NDoN[1])
}
printf '%s\n' **/*(D/F^+has_useful_entries)
would you add a little explanation
– Rami Sedhom
Aug 13 at 14:43
@RamiSedhom, see edit.
– Stéphane Chazelas
Aug 13 at 14:57
edited Aug 14 at 8:12
answered Aug 13 at 14:37
Stéphane Chazelas
find ~/ -type f -print0 2>/dev/null |
  awk -F/ 'BEGIN { RS = ORS = "\0" }
           { dir = $0; sub(/\/[^\/]*$/, "", dir); seen[dir]++ }
           $NF ~ /^(desktop\.ini|Thumbs\.db|\.picasa\.ini|.*\.tmp)$/ { found[dir]++ }
           END { for (d in seen) if (seen[d] == found[d]) print d }'
This uses find to just output a NUL-terminated list of files (and only files, -type f) in or beneath the target directory (~/) and pipe them into an awk script. The 2>/dev/null is to get rid of warning messages from find if/when the user does not have permission to descend into some sub-directories.
The awk script uses a / as the field separator and sets both the input (RS) and output (ORS) record separators to NUL. It extracts the directory portion of the filename from the input record and keeps count of how many times that directory has been seen (using associative array seen). Then, if the final field ($NF) matches one of the desired filename patterns, it keeps count of the matches (using associative array found).
Once all the input has been processed, it prints out every directory where the number of times the directory has been seen is equal to the number of found matches for that directory, i.e. it prints only the directories containing ONLY matching files.
Because the ORS is a NUL, the output of this can be safely used as input to xargs -0r rm -rf or a similar command, without risk of problems due to spaces, linefeeds or other problematic shell meta-characters in the directory names.
The output can be further processed by any tool or scripting language that can work with NUL-separated input, including perl and the GNU versions of sed, sort, grep, head, tail, and many more. In many cases, you're probably better off either tweaking the find options or doing extra processing in the awk script (or just rewriting the whole thing in perl using the File::Find module).
BTW, if you haven't yet finalised what kind of post-processing (if any) you want to do on the directory list, redirecting the output of the find ... | awk ... to a file is useful because the find operation is very demanding on disk I/O - using a file as input to further processing avoids multiple runs just to provide the same input (i.e. it's a cache).
Finally, if you want to visually examine the output (e.g. to make sure you aren't going to delete anything important), change the RS=ORS="\0" line to RS="\0", so you get a line-feed between each directory name. This can't be safely used as input to xargs because line-feeds are valid characters in unix file/directory names.
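The caching and preview ideas in the last two paragraphs can be combined without editing the awk script: keep the NUL-delimited list in a file, and translate NULs to newlines only when a person needs to read it. The sketch below assumes a hypothetical helper name, preview_nul_list, and a hypothetical cache file name, dirs.nul.

```shell
#!/bin/sh
# Sketch: show a cached NUL-delimited list one record per line, for reading only.
# preview_nul_list and the dirs.nul file name are illustrative assumptions.
preview_nul_list() {
  # tr rewrites each NUL terminator as a newline; the cached file is not modified.
  tr '\0' '\n' < "$1"
}

# Possible usage (commented out so that sourcing this file does nothing):
#   find ~/ -type f -print0 2>/dev/null | awk ... > dirs.nul   # run the slow scan once
#   preview_nul_list dirs.nul | less                           # inspect by eye
#   xargs -0 -r ls -ld < dirs.nul                              # reuse the safe NUL form
```

The newline form is for inspection only; anything that acts on the names should keep reading the NUL-delimited file.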
add a comment |
find ~/ -type f -print0 2>/dev/null |
awk -F/ 'BEGIN RS=ORS="";
^(desktop.ini;
END for (d in seen) if (seen[d] == found[d]) print d'
This uses find
to just output a NUL-terminated list of files (and only files, -type f
) in or beneath the target directory (~/
) and pipe them into an awk script. The 2>/dev/null
is to get rid of warning messages from find
if/when the user does not have permission to descend into some sub-directories.
The awk script uses a /
as the field separator and sets both the input (RS) and output (ORS) record separators to NUL. It extracts the directory portion of the filename from the input record and keeps count of how many times that directory has been seen (using associative array seen
). Then, if the final field ($NF) matches one of the desired filename patterns, it keeps count of the matches (using associative arrray found
).
Once all the input has been processed, it prints out every directory where the number of times the directory has been seen is equal to the number of found matches for that directory.
i.e. it prints only the directories containing ONLY matching files.
Because the ORS is a NUL, the output of this can be safely used as input to xargs -0r rm -rf
or a similar command, without risk of problems due to spaces, linefeeds or other problematic shell meta-characters in the directory names.
The output can be further processed by any tool or scripting language that can work with NUL-separated input, including perl
and the GNU versions of sed
, sort
, grep
, head
, tail
, and many more. In many cases, you're probably better off either tweaking the find
options or doing extra processing in the awk
script (or just rewriting the whole thing in perl
using the File::Find
module).
BTW, if you haven't yet finalised what kind of post-processing (if any) you want to do on the directory list, redirecting the output of the find ... | awk ...
to a file is useful because the find
operation is very demanding on disk I/O - using a file as input to further processing avoids multiple runs just to provide the same input (i.e. it's a cache).
Finally, if you want to visually examine the output (e.g. to make sure you aren't going to delete anything important), change the RS=ORS=""
line to RS=""
, so you get a line-feed between each directory name. This can't be safely used as input to xargs
because there line-feeds are valid characters in unix file/directory names.
add a comment |
find ~/ -type f -print0 2>/dev/null |
awk -F/ 'BEGIN RS=ORS="";
^(desktop.ini;
END for (d in seen) if (seen[d] == found[d]) print d'
This uses find to output a NUL-terminated list of files (and only files, -type f) in or beneath the target directory (~/) and pipes it into an awk script. The 2>/dev/null discards the warning messages find prints when the user does not have permission to descend into some sub-directories.
The awk script uses / as the field separator and sets both the input (RS) and output (ORS) record separators to NUL. It extracts the directory portion of the filename from each input record and counts how many times that directory has been seen (in the associative array seen). Then, if the final field ($NF) matches one of the desired filename patterns, it counts the match (in the associative array found).
Once all the input has been processed, it prints every directory where the number of times the directory was seen equals the number of matches found for it, i.e. it prints only the directories containing ONLY matching files.
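The counting logic described above can be tried on a throwaway tree before pointing it at a real home directory. This is only a sketch: the directory and file names are invented for illustration, and it uses newline-separated records (plain find output, default RS and ORS) purely for readability, which is acceptable here only because the sandbox names contain no newlines.

```shell
# Hypothetical sandbox; all names below are made up for this demonstration.
tmp=$(mktemp -d)
mkdir -p "$tmp/only_junk" "$tmp/mixed" "$tmp/clean"
touch "$tmp/only_junk/desktop.ini"
touch "$tmp/mixed/desktop.ini" "$tmp/mixed/report.txt"
touch "$tmp/clean/notes.txt"

# Same seen/found comparison, but with newline-separated records.
result=$(find "$tmp" -type f |
  awk -F/ '
    { dir = $0; sub("/[^/]*$", "", dir); seen[dir]++ }   # files per directory
    $NF ~ /^(desktop\.ini)$/ { found[dir]++ }            # matching files per directory
    END { for (d in seen) if (seen[d] == found[d]) print d }')

# Only the directory whose every file matched should be listed.
echo "$result"

rm -rf "$tmp"
```

Only only_junk is reported: mixed has two files but one match, and clean has no matches at all.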
Because the ORS is a NUL, the output can be safely used as input to xargs -0r rm -rf or a similar command, without risk of problems due to spaces, linefeeds or other problematic shell meta-characters in the directory names.
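As a rough sketch of that hand-off, the following feeds a NUL-terminated list to xargs, first as a dry run and then for real. It assumes GNU xargs (for -r), and the list is a stand-in written with printf rather than the real find | awk output; the sandbox names are hypothetical.

```shell
# Hypothetical sandbox; list.nul stands in for the find | awk output.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir with spaces" "$tmp/plain" "$tmp/keep"
list="$tmp/list.nul"
printf '%s\0' "$tmp/dir with spaces" "$tmp/plain" > "$list"

# Dry run: show what would be removed without touching anything.
preview=$(xargs -0r printf 'would remove: %s\n' < "$list")
echo "$preview"

# Real run: -0 splits on NUL only, -r skips rm entirely if the list is empty.
xargs -0r rm -rf -- < "$list"

# Directories left in the sandbox after the removal.
remaining=$(cd "$tmp" && find . -mindepth 1 -type d)
echo "$remaining"

rm -rf "$tmp"
```

Running the dry run first is a cheap safeguard, since rm -rf gives no second chance.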
The output can be further processed by any tool or scripting language that can work with NUL-separated input, including perl and the GNU versions of sed, sort, grep, head, tail, and many more. In many cases, you're probably better off either tweaking the find options or doing extra processing in the awk script (or just rewriting the whole thing in perl using the File::Find module).
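For instance, the GNU versions of sort, grep and head each take a -z option that switches them to NUL-separated records. The snippet below is a small illustration on an invented list, assuming GNU coreutils and GNU grep are installed; tr is used only to make the results printable.

```shell
# Hypothetical NUL-separated list of paths, written to a temporary file.
list=$(mktemp)
printf '%s\0' './b dir' './a dir' './c/old cache' > "$list"

# sort -z sorts NUL-terminated records; tr converts to lines for display only.
sorted=$(sort -z "$list" | tr '\0' '\n')

# head -z takes the first NUL-terminated record.
first=$(sort -z "$list" | head -z -n 1 | tr '\0' '\n')

# grep -z matches the pattern against each NUL-terminated record.
matches=$(grep -z 'cache' "$list" | tr '\0' '\n')

echo "$sorted"
echo "first: $first"
echo "matches: $matches"

rm -f "$list"
```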
BTW, if you haven't yet finalised what kind of post-processing (if any) you want to do on the directory list, redirecting the output of find ... | awk ... to a file is useful because the find operation is very demanding on disk I/O. Using a file as input to further processing avoids multiple runs just to provide the same input (i.e. it's a cache).
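A minimal sketch of the caching idea: run the traversal once, save the NUL-separated result, then read the file as often as needed. Here a plain find -print0 stands in for the full find ... | awk ... pipeline, and the file name dirs.nul and the sandbox layout are assumptions made for the example.

```shell
# Hypothetical sandbox tree.
tmp=$(mktemp -d)
mkdir -p "$tmp/tree/a" "$tmp/tree/b c"
cache="$tmp/dirs.nul"

# The expensive traversal runs once; its output is saved as-is.
find "$tmp/tree" -mindepth 1 -type d -print0 > "$cache"

# Every later step reads the cache instead of walking the disk again.
count=$(tr -cd '\0' < "$cache" | wc -c)       # number of entries
listing=$(tr '\0' '\n' < "$cache" | sort)     # human-readable view

echo "entries: $count"
echo "$listing"

rm -rf "$tmp"
```

Keep in mind that the cache goes stale as soon as the tree changes, so it is best regenerated right before any destructive step.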
Finally, if you want to visually examine the output (e.g. to make sure you aren't going to delete anything important), change the RS=ORS="\0" line to RS="\0", so you get a line-feed between each directory name. This can't be safely used as input to xargs, because line-feeds are valid characters in unix file/directory names.
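To see why newline-separated output is fine for eyeballing but unsafe for automation, consider a directory whose name contains a line-feed. The sketch below (sandbox and names invented for the demonstration) counts the same single directory once with newline-terminated output and once with NUL-terminated output.

```shell
# Hypothetical sandbox with one directory whose name contains a line-feed.
tmp=$(mktemp -d)
name=$(printf 'odd\nname')
mkdir "$tmp/$name"

# Newline-terminated output: the single directory shows up as two lines.
nl_count=$(find "$tmp" -mindepth 1 -type d -print | wc -l)

# NUL-terminated output: exactly one record for the one directory.
nul_count=$(find "$tmp" -mindepth 1 -type d -print0 | tr -cd '\0' | wc -c)

echo "newline records: $nl_count"
echo "NUL records: $nul_count"

rm -rf "$tmp"
```

A line-based consumer would treat the two halves as separate paths, which is exactly the failure the NUL separator avoids.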
answered Aug 14 at 4:39
cas
41.8k, 4 gold badges, 59 silver badges, 111 bronze badges
6
Do you actually need to find the directories, or are you just going to delete those useless files and the directories? In the second case, you could just first delete the useless files, and then delete all now-empty directories. (Which would of course also remove any directories that were empty to begin with.)
– ilkkachu
Aug 13 at 15:01
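The two-step approach suggested in this comment could be sketched as follows, assuming a find that supports the -delete and -empty extensions (GNU and BSD find both do). The sandbox and file names are hypothetical, and as the comment notes, directories that were already empty are removed too.

```shell
# Hypothetical sandbox; names are invented for illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/only_junk" "$tmp/mixed" "$tmp/was_empty"
touch "$tmp/only_junk/desktop.ini" "$tmp/mixed/desktop.ini" "$tmp/mixed/report.txt"

# Step 1: delete the unwanted files wherever they are.
find "$tmp" -type f -name 'desktop.ini' -delete

# Step 2: delete every directory that is now empty (-delete implies depth-first order).
find "$tmp" -mindepth 1 -type d -empty -delete

# What is left in the sandbox.
remaining=$(cd "$tmp" && find . -mindepth 1 | sort)
echo "$remaining"

rm -rf "$tmp"
```

Only mixed and its report.txt survive; only_junk disappears because it became empty, and was_empty because it already was.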