Python List Directory, Subdirectory, and Files - Stack Overflow
I'm trying to make a script to list all directories, subdirectories, and files in a given directory.
I tried this:
import sys, os
root = "/home/patate/directory/"
path = os.path.join(root, "targetdirectory")
for r, d, f in os.walk(path):
    for file in f:
        print(os.path.join(root, file))
For example, given a file at:
/home/patate/directory/targetdirectory/123/456/789/file.txt
It would print:
/home/patate/directory/targetdirectory/file.txt
What I need is the first result. Any help would be greatly appreciated! Thanks.
asked May 26, 2010 at 3:38 by thomytheyon; edited Mar 19 at 8:07 by Martin Thoma
12 Answers
https://stackoverflow.com/questions/2909975/python-list-directory-subdirectory-and-files 1/10
12/07/2022 19:39 Python list directory, subdirectory, and files - Stack Overflow
Note the usage of path and not root in the concatenation, since using root would be
incorrect.
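The note above is about joining each file name with the directory that os.walk is currently visiting, rather than with the fixed starting directory. A minimal sketch of such a loop, using the path from the question (the function name list_files is only for illustration):

```python
import os


def list_files(top):
    """Return the full path of every file below `top`."""
    found = []
    # os.walk yields (current_dir, subdir_names, file_names) for each directory.
    for path, subdirs, files in os.walk(top):
        for name in files:
            # Join with `path` (the directory being visited), not with `top`.
            found.append(os.path.join(path, name))
    return found


if __name__ == "__main__":
    for full_path in list_files("/home/patate/directory/targetdirectory"):
        print(full_path)
```

With the layout from the question, this should print the nested .../123/456/789/file.txt path instead of a path directly under the starting directory.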
In Python 3.4, the pathlib module was added for easier path manipulations. So the equivalent
to os.path.join would be:
pathlib.PurePath(path, name)
The advantage of pathlib is that you can use a variety of useful methods on paths. If you use
the concrete Path variant you can also do actual OS calls through them, like changing into a
directory, deleting the path, opening the file it points to and much more.
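As a rough illustration of the difference, the sketch below joins a directory and a file name with PurePath and then wraps the result in the concrete Path class so that it can touch the filesystem. The helper name list_files_pathlib is made up for this example.

```python
import os
from pathlib import Path, PurePath


def list_files_pathlib(top):
    """Return a Path for every file below `top`, built from os.walk results."""
    found = []
    for path, subdirs, files in os.walk(top):
        for name in files:
            # PurePath(path, name) joins like os.path.join but does no I/O;
            # wrapping it in Path gives access to methods such as is_file().
            found.append(Path(PurePath(path, name)))
    return found


if __name__ == "__main__":
    for p in list_files_pathlib("."):
        # is_file() is an actual OS call made through the Path object.
        print(p, p.is_file())
```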
answered May 26, 2010 at 3:46 by Eli Bendersky; edited Dec 14, 2020 at 11:38 by Rui Vieira
this is the one and only useful answer for the many questions that have been asked concerning "how to
get all files recursively in python". – harrisonfooord Oct 12, 2018 at 14:37
comprehension list: all_files = [os.path.join(path, name) for path, subdirs, files in
os.walk(folder) for name in files] – Nir Aug 12, 2019 at 14:40
In Python 3, use parentheses for the print function: print(os.path.join(path, name)) – Ehsan Aug 7,
2020 at 11:31
Just in case... Getting all files in the directory and subdirectories matching some pattern (*.py
for example):
import os
from fnmatch import fnmatch
root = '/some/directory'
pattern = "*.py"
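The snippet above stops after defining pattern. A plausible way to finish it, assuming the intent is to walk the tree and keep only the names that match (the function name find_matching is hypothetical):

```python
import os
from fnmatch import fnmatch


def find_matching(root, pattern):
    """Return paths of files below `root` whose names match `pattern`."""
    found = []
    for path, subdirs, files in os.walk(root):
        for name in files:
            # fnmatch compares the bare file name against a shell-style pattern.
            if fnmatch(name, pattern):
                found.append(os.path.join(path, name))
    return found


if __name__ == "__main__":
    for match in find_matching("/some/directory", "*.py"):
        print(match)
```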
answered Nov 4, 2012 at 0:38 by Ivan Pirog; edited Mar 29 at 18:25
In Python3 use parenthesis for print function print(os.path.join(path, name)) . You can also use
print(pathlib.PurePath(path, name)) . – Ahmad Ismail Jul 5, 2021 at 9:28
Couldn't comment, so I'm writing an answer here. This is the clearest one-liner I have seen:
import os
[os.path.join(path, name) for path, subdirs, files in os.walk(root) for name in files]
this is the answer for all you googlers – Matt Feb 23 at 9:58
Here is a one-liner:
import os
The outermost val for sublist in ... loop flattens the list so that it is one-dimensional. The j
loop collects every file basename and joins it to the current path. Finally, the i loop
iterates over all directories and subdirectories.
This example uses the hard-coded path ./ in the os.walk(...) call; you can substitute any
path string you like.
Note: os.path.expanduser and/or os.path.expandvars can be used for path strings like ~/
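The one-liner itself does not appear above. A sketch that matches the description (an outer val for sublist in ... flattening step, a j loop over file basenames, and an i loop over the os.walk results) might look like this. The wrapper function flat_file_list and the use of expanduser are additions for illustration.

```python
import os


def flat_file_list(top="./"):
    # expanduser turns a leading "~" into the home directory; "./" is unchanged.
    top = os.path.expanduser(top)
    return [
        val
        for sublist in [
            # j: each file basename in the directory, joined to its path
            [os.path.join(i[0], j) for j in i[2]]
            # i: one (dirpath, dirnames, filenames) tuple per directory
            for i in os.walk(top)
        ]
        # flatten the list of per-directory lists
        for val in sublist
    ]


if __name__ == "__main__":
    print(flat_file_list("./"))
```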
answered Sep 26, 2014 at 21:03 by ThorSummoner; edited May 20, 2015 at 16:15
It does work, but to exclude the .git directory you need to check that '.git' is NOT in the path. – Roman Rdgz
May 19, 2015 at 10:32
Yep. Should be if '.git' not in i[0].split('/')] – Roman Rdgz May 19, 2015 at 15:24
I would recommend os.walk over a manual dirlisting loop, generators are great, go use them.
– ThorSummoner Dec 4, 2016 at 19:18
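Picking up the .git comments above: one way to skip .git with os.walk is to prune it from the list of subdirectories in place, so the walk never descends into it. This is a sketch of that alternative, not the split-based check from the comment, and walk_without_git is a made-up name.

```python
import os


def walk_without_git(top):
    """Return all file paths below `top`, skipping any .git directory."""
    found = []
    for path, subdirs, files in os.walk(top):
        # Editing subdirs in place (top-down walk) stops os.walk from entering .git.
        subdirs[:] = [d for d in subdirs if d != ".git"]
        for name in files:
            found.append(os.path.join(path, name))
    return found


if __name__ == "__main__":
    for file_path in walk_without_git("."):
        print(file_path)
```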
import os
from itertools import product, chain
answered Feb 21, 2018 at 8:44 by Daniel; edited Apr 25, 2020 at 8:47 by Jean-François Fabre
how do I list each file ? – Aakash Gupta Sep 22, 2021 at 4:43
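In reply to the comment above: the rest of this answer's code is not shown, so the following is only a guess at how product and chain could be combined with os.walk and then iterated to list each file. The name iter_files is hypothetical.

```python
import os
from itertools import chain, product


def iter_files(top):
    # product([dirpath], filenames) pairs the directory with each of its files;
    # chain.from_iterable flattens the per-directory results into one iterator.
    return chain.from_iterable(
        (os.path.join(*pair) for pair in product([dirpath], filenames))
        for dirpath, dirnames, filenames in os.walk(top)
    )


if __name__ == "__main__":
    # Iterating over the result lists each file, one per line.
    for file_path in iter_files("."):
        print(file_path)
```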
You can take a look at this sample I made. Beware that it uses the os.path.walk function, which is
deprecated (and was removed in Python 3). It uses a list to store all the file paths.
import fnmatch
import os

root = "Your root directory"
ex = ".txt"
where_to = "Wherever you wanna write your file to"

def fileWalker(ext, dirname, names):
    '''Callback for os.path.walk: checks the files in names.'''
    # ext is the [extension, result_list] pair passed to os.path.walk
    pat = "*" + ext[0]
    for f in names:
        if fnmatch.fnmatch(f, pat):
            ext[1].append(os.path.join(dirname, f))

def writeTo(fList):
    # Write one collected path per line
    with open(where_to, "w") as f:
        for di_r in fList:
            f.write(di_r + "\n")

if __name__ == '__main__':
    li = []
    # Python 2 only: os.path.walk calls fileWalker once per directory
    os.path.walk(root, fileWalker, [ex, li])
    writeTo(li)
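Because os.path.walk does not exist in Python 3, the same idea can be expressed with os.walk. The sketch below is an adaptation, not the original author's code, and the function names are illustrative. One difference: os.path.walk passed both file and directory names to the callback, while this version only looks at file names.

```python
import fnmatch
import os


def collect_by_extension(root, ext):
    """Return paths of all files below `root` whose names end with `ext`."""
    pattern = "*" + ext
    matches = []
    for dirname, subdirs, names in os.walk(root):
        # fnmatch.filter keeps only the names that match the pattern.
        for name in fnmatch.filter(names, pattern):
            matches.append(os.path.join(dirname, name))
    return matches


def write_list(paths, destination):
    """Write one path per line to `destination`."""
    with open(destination, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
```

A call such as write_list(collect_by_extension(root, ".txt"), where_to) would then play the role of the __main__ block above.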
Since every example here just uses walk (with join), I'd like to show an example and a
comparison with listdir:
import os, time

# listFiles1, listFiles2 and listFiles3 (listdir, listdir + join, walk) are not shown here.
def listFiles4(root): # walk/join (takes ~1.6x as long) (and uses '\\' instead)
    allFiles = []
    for folder, folders, files in os.walk(root):
        for file in files:
            allFiles.append(os.path.join(folder, file))
    return allFiles

start = time.time()
for i in range(100): files = listFiles1("src") # listdir
print("Time taken: %.2fs"%(time.time()-start)) # 0.28s
start = time.time()
for i in range(100): files = listFiles2("src") # listdir and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.38s
start = time.time()
for i in range(100): files = listFiles3("src") # walk
print("Time taken: %.2fs"%(time.time()-start)) # 0.42s
start = time.time()
for i in range(100): files = listFiles4("src") # walk and join
print("Time taken: %.2fs"%(time.time()-start)) # 0.47s
So, as you can see for yourself, the listdir version was noticeably faster in this test (0.28s
versus 0.42s for walk), and join adds further overhead in both cases.
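The listdir-based functions being timed are not shown above. As a rough idea of what such a function could look like, here is a hypothetical recursive listing built on os.listdir. It is a stand-in written for illustration, not the code that produced the timings.

```python
import os


def list_files_listdir(root):
    """Return all file paths below `root` using os.listdir and an explicit stack."""
    all_files = []
    pending = [root]  # directories still to be scanned
    while pending:
        folder = pending.pop()
        for entry in os.listdir(folder):
            full = os.path.join(folder, entry)
            # Directories are queued for later; everything else is recorded.
            if os.path.isdir(full):
                pending.append(full)
            else:
                all_files.append(full)
    return all_files
```

Note that os.path.isdir follows symbolic links, so unlike os.walk with its default followlinks=False, this sketch descends into symlinked directories. Since Python 3.5, os.walk is built on os.scandir, so relative timings may differ from those quoted above.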
answered Feb 1, 2019 at 6:37 by Puddle; edited Feb 1, 2019 at 7:13
This is just an addition: with this you can get the data into CSV format.
import sys, os
try:
    import pandas as pd
except ImportError:
    # pandas is missing: install it, then import it
    os.system("pip3 install pandas")
    import pandas as pd
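The rest of this snippet is missing. As an illustration of the idea (collecting file paths and saving them as CSV), here is a sketch that uses the standard-library csv module instead of pandas, so that it runs without extra installs. The function name, the column header filePaths, and the output location are assumptions.

```python
import csv
import os


def write_paths_to_csv(root, csv_path):
    """Write every file path below `root` to `csv_path`; return the number of rows."""
    count = 0
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filePaths"])  # header row (assumed column name)
        for path, subdirs, files in os.walk(root):
            for name in files:
                writer.writerow([os.path.join(path, name)])
                count += 1
    return count
```

It is probably best to write the CSV file somewhere outside root, otherwise the output file may show up in its own listing.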
Another option would be using the glob module from the standard lib:
import glob
path = "/home/patate/directory/targetdirectory/**"
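The snippet above ends before the glob call. With a ** pattern, glob only recurses when recursive=True is passed, so a sketch of the remaining step might be (the helper name glob_everything is hypothetical):

```python
import glob
import os


def glob_everything(top):
    # "**" matches files and any number of directory levels when recursive=True.
    pattern = os.path.join(top, "**")
    return glob.glob(pattern, recursive=True)


if __name__ == "__main__":
    for found in glob_everything("/home/patate/directory/targetdirectory"):
        print(found)
```

The result contains directories as well as files, and by default names starting with a dot are skipped.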
answered Nov 11, 2021 at 23:25 by Rotareti
Using any supported Python version (3.4+), you should use pathlib.Path.rglob to recursively list
the contents of the current directory and all subdirectories:
from pathlib import Path
Example
Folder structure:
$ tree . -a
.
├── a.txt
├── bar
├── b.py
├── collect.py
├── empty
├── foo
│   └── bar.bz.gz2
├── .hidden
│   └── secrect-file
└── martin
    └── thoma
        └── cv.pdf
gives:
$ python collect.py
bar
empty
.hidden
collect.py
a.txt
b.py
martin
foo
.hidden/secrect-file
martin/thoma
martin/thoma/cv.pdf
foo/bar.bz.gz2
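The body of collect.py is not shown here. A minimal sketch that would produce a listing like the one above, assuming the script simply prints every entry returned by Path.rglob (the function name iter_entries and its only_files flag are illustrative):

```python
from pathlib import Path


def iter_entries(root, only_files=False):
    # rglob("*") yields every file and directory below root, hidden ones included.
    for p in root.rglob("*"):
        if only_files and not p.is_file():
            continue
        yield p


if __name__ == "__main__":
    for entry in iter_entries(Path(".")):
        print(entry)
```

Passing only_files=True would drop directories such as bar, empty and martin/thoma from the output.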
A pretty simple solution would be to run a couple of subprocess calls to export the files into
CSV format:
import subprocess
# Find the requested data and export to CSV, specifying a pattern if needed.
# location, pattern and outputFile must already be defined; -fprintf requires GNU find.
find_cmd = ('find ' + location + ' -name ' + pattern + ' -fprintf ' +
            outputFile + ' "%Y%M,%n,%u,%g,%s,%A+,%P\n"')
subprocess.call(find_cmd, shell=True)
That command produces comma separated values that can be easily analyzed in Excel.
f-rwxrwxrwx,1,cathy,cathy,2642,2021-06-01+00:22:00.2970880000,content-audit.py
The resulting CSV file doesn't have a header row, but you can use a second command to add
one.
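The second command is not shown. One way to add a header row from Python is sketched below. The column names are hypothetical labels for the seven fields in the format string, and add_header is a made-up helper.

```python
def add_header(csv_path, header):
    """Prepend a comma-separated header line to an existing CSV file."""
    # Read the existing rows, then rewrite the file with the header first.
    with open(csv_path, "r", newline="") as handle:
        body = handle.read()
    with open(csv_path, "w", newline="") as handle:
        handle.write(",".join(header) + "\n")
        handle.write(body)


# Hypothetical column names, one per field in "%Y%M,%n,%u,%g,%s,%A+,%P".
HEADER = ["TypeAndMode", "Links", "User", "Group", "Size", "AccessTime", "FilePath"]
```

This reads the whole file into memory, which should be fine for modest listings but may not suit very large ones.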
Depending on how much data you get back, you can massage it further using Pandas. Here
are some things I found useful, especially if you're dealing with many levels of directories to
look through.
import numpy as np
import pandas as pd
# df is assumed to hold the CSV produced above (with FilePath, Type and Size columns),
# and rootDir the top-level directory that was searched.
# Format columns
# Get the filename and file extension from the filepath
df['FileName'] = df['FilePath'].str.rsplit("/", n=1).str[-1]
df['FileExt'] = df['FileName'].str.rsplit('.', n=1).str[1]
# Get the full path to the files. If the path doesn't include a "/" it's the root directory
df['FullPath'] = df["FilePath"].str.rsplit("/", n=1).str[0]
df['FullPath'] = np.where(df['FullPath'].str.contains("/"), df['FullPath'], rootDir)
# Split the path into columns for the parent directory and its children
df['ParentDir'] = df['FullPath'].str.split("/", n=1).str[0]
df['SubDirs'] = df['FullPath'].str.split("/", n=1).str[1]
# Account for NaN returns, indicates the path is the root directory
df['SubDirs'] = df['SubDirs'].fillna('')
# Show only files, output includes paths so you don't necessarily need to
# display the individual directories.
df = df[df['Type'].str.contains('File')]
# Go through the items and convert the filesize from bytes to something more readable.
# convert_bytes is a helper that is not shown here; only its final "return size" line appears.
filesize = []
for items in df['Size'].items():
    filesize.append(convert_bytes(items[1]))
df['Size'] = filesize
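convert_bytes is not defined in the code above. A hypothetical version, assuming it should turn a byte count into a short human-readable string using 1024-based units:

```python
def convert_bytes(size):
    """Format a byte count as a short string such as '2.6 KB' (assumed behaviour)."""
    size = float(size)
    for unit in ["B", "KB", "MB", "GB"]:
        if size < 1024:
            return "%.1f %s" % (size, unit)
        size /= 1024
    # Anything this large is reported in terabytes.
    return "%.1f TB" % size


if __name__ == "__main__":
    print(convert_bytes(2642))
```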
And this is how you list the files in case they are on SharePoint. Your path will
probably start after the "\teams\" part.
import os
root = r"\\mycompany.sharepoint.com@SSL\DavWWWRoot\teams\MyFolder\Policies and Procedures\Deal Docs\My Deals"
# file_paths avoids shadowing the built-in name list
file_paths = [os.path.join(path, name) for path, subdirs, files in os.walk(root)
              for name in files]
print(file_paths)