Directory Tree Depth: Report
-
Filenames should be descriptive and readable, but not self-documenting. That's not their purpose.
-
Filing and sorting is always somewhat subjective, a matter of what 'fits'. Ultimately it's up to the person to decide how they want to arrange things.
When I was doing basic computer instruction I used music as my example, since 95% of people understand music and know what it is.
I started with My Music at the top of the tree, then genre, then group or artist, then album, then song.
But that is a pretty shallow directory tree. I'm seeing some trees as many as 10 or 11 folders deep before you reach the files.
One backup report I'm reviewing has 136,745 files in 11,486 folders, totaling 108,405,977 KB; that just seems a bit deep.
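Out of curiosity, here is a quick way to actually measure that. This is just a minimal sketch using Python's os.walk; counting path separators below the starting folder is one reasonable way to define depth among several.

import os

# A minimal sketch: report the deepest folder nesting under a starting path.
# Depth is counted as the number of levels below the starting folder itself.
mypath = input("What starting path would you like? ")

deepest, deepest_dir = 0, mypath
for dirpath, dirnames, filenames in os.walk(mypath):
    rel = os.path.relpath(dirpath, mypath)
    depth = 0 if rel == "." else rel.count(os.sep) + 1
    if depth > deepest:
        deepest, deepest_dir = depth, dirpath

print("Deepest nesting:", deepest, "levels below", mypath)
print("At:", deepest_dir)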
-
This is where traditional filesystems have broken down, and why more modern approaches like SharePoint, with flat storage and heavy metadata, tend to work so much better.
-
@scottalanmiller said:
This is where traditional filesystems have broken down, and why more modern approaches like SharePoint, with flat storage and heavy metadata, tend to work so much better.
Heavy metadata to enable searches? What populates the metadata portion?
-
@Dashrender said:
Heavy metadata to enable searches? What populates the metadata portion?
The same thing that creates folders and filenames... humans.
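To make that concrete, here is a toy sketch of the idea; every filename and tag below is hypothetical. Files sit in one flat store, humans attach the metadata, and searches run against the tags rather than a folder path:

# A toy illustration of flat storage plus human-entered metadata.
# (All filenames and tags here are hypothetical.)
flat_store = {
    "Q3-budget-final.xlsx": {"department": "finance", "year": 2015, "type": "budget"},
    "onboarding-checklist.docx": {"department": "hr", "year": 2015, "type": "checklist"},
    "Q3-forecast.xlsx": {"department": "finance", "year": 2015, "type": "forecast"},
}

def search(**criteria):
    """Return every file whose metadata matches all the given criteria."""
    return [name for name, meta in flat_store.items()
            if all(meta.get(key) == value for key, value in criteria.items())]

# No folder path to remember; ask for what the file *is*, not where it lives.
print(search(department="finance", type="budget"))  # ['Q3-budget-final.xlsx']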
-
@scottalanmiller said:
@Dashrender said:
Heavy metadata to enable searches? What populates the metadata portion?
The same thing that creates folders and filenames... humans.
Good metadata requires a lot of consideration. I rarely find a good folder structure, hence people are losing things all the time.
-
@Dashrender said:
Good metadata requires a lot of consideration. I rarely find a good folder structure, hence people are losing things all the time.
In which case the value of the organization is moot, and all that matters is keeping filenames short and not making things hard for other people.
-
Here is an example in Python that walks a tree and prints the length of every full path:

import os

# Ask where to start, then walk every folder below that point.
mypath = input("What starting path would you like? ")

for dirpath, dirnames, filenames in os.walk(mypath):
    for name in filenames:
        # Build the full path once and measure its total length.
        fullname = os.path.join(dirpath, name)
        namelength = len(fullname)
        print(str(namelength) + " Name: " + fullname)
-
Now let's take this up a notch. Rather than just printing a list, let's build a dictionary (a.k.a. a hash or a map) that we can then sort. This will not only let us look for the biggest offenders but also let us filter out the shorter filenames that we don't care about.

import os

# Ask for a starting point and a length threshold.
mypath = input("What starting path would you like? ")
limit = int(input("Only show filenames longer than? "))

# Map each full path to its length.
filedict = {}
for dirpath, dirnames, filenames in os.walk(mypath):
    for name in filenames:
        fullname = os.path.join(dirpath, name)
        filedict[fullname] = len(fullname)

# Show the offenders from longest to shortest, skipping anything under the limit.
for offender in sorted(filedict, key=filedict.get, reverse=True):
    if filedict[offender] > limit:
        print(offender, filedict[offender])
-
This is actually a problem in PowerShell too, because it hits the same MAX_PATH limit of 260 characters! I just don't understand why this hasn't been permanently fixed; the problem has only been around for a decade or so!
Anyway, to accomplish this in PowerShell, the best bet is to use Robocopy to just list the directory contents; then it's child's play to filter for lengths over the limit and display them. It's the Robocopy part that's a pain; luckily:
http://thesurlyadmin.com/2014/08/04/getting-directory-information-fast/
Not exactly on topic, but it has the code for building an array with the data in it.
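For what it's worth, that same Robocopy listing trick can be driven from Python as well. A rough sketch, Windows only; it assumes robocopy.exe is on the PATH and uses a dummy destination name, which /L (list-only mode) never actually writes to:

import subprocess

# Rough sketch: let Robocopy enumerate the files (it copes with long paths
# better than most tools), then filter the listing by length in Python.
# /L = list only, /S = recurse, /NJH /NJS = no job header/summary,
# /FP = full pathnames, /NC /NS = drop class and size columns,
# /NDL = don't list the directories themselves.
mypath = input("What starting path would you like? ")
limit = int(input("Only show filenames longer than? "))

listing = subprocess.run(
    ["robocopy", mypath, "NULL", "/L", "/S", "/NJH", "/NJS",
     "/FP", "/NC", "/NS", "/NDL"],
    capture_output=True, text=True,
).stdout

for line in listing.splitlines():
    name = line.strip()
    if name and len(name) > limit:
        print(len(name), name)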