Get CSV List of All Files on Your Domain

The software that comes with this article creates a CSV file listing the name, date, size, and location of every file in your domain's document root and its subdirectories.

Only the files in the document root and its subdirectories (the document area) will be listed. MySQL data files and cPanel statistics, for example, are stored outside the document area and are not listed.

The document area is wherever web pages can be put for display in browsers. Usually, that is the directory containing the domain's home or index page file, plus all of its subdirectories.

A CSV file generally is easier to peruse than navigating a large site looking for certain files. The CSV has all your files listed in one document.

You might be surprised by the number of files in your domain and, perhaps, also by some of the file names.

When you have the CSV file, it can be imported into your spreadsheet software. Use it to see which are the largest files. Spot unnecessary files. There may be some totally unexpected files.

Sort the spreadsheet however you want to better assimilate what your document area contains.

  • Total the size column for the total size of all files in your document area.

  • Sort by file size for a view of the largest files. And perhaps find empty ones.

  • Sort by file date for date-range views.

  • Sort by file names to spot files that might not belong in the document area (or anywhere on your server).

  • Sort by file location for a by-directory view.
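If you would rather get a few of those numbers without opening a spreadsheet, the same checks can be sketched in PHP. This is only a rough illustration, assuming the CSV has the four columns the script writes (TimeStamp, FileSize, FileName, FileLocation). The function name summarizeFileList is made up for this example and is not part of the Get File List CSV software. It keeps every file's size in memory, so it may need adjusting for very large lists.

```php
<?php
// Sketch: summarize a CSV created by Get File List CSV.
// Assumed columns: TimeStamp, FileSize, FileName, FileLocation.
// summarizeFileList() is a hypothetical helper for illustration only.
function summarizeFileList($csvPath, $topN = 5)
{
   $summary = array('count' => 0, 'totalBytes' => 0, 'empty' => array(), 'largest' => array());
   $fh = fopen($csvPath, 'r');
   if( ! $fh ) { return $summary; }
   // The empty escape argument (PHP 7.4+) makes "" the only quote escape.
   fgetcsv($fh, 0, ',', '"', ''); // Skip the header row.
   $sizes = array();
   while( ($row = fgetcsv($fh, 0, ',', '"', '')) !== false )
   {
      if( count($row) < 4 ) { continue; } // Skip blank or malformed lines.
      $size = (int)$row[1];
      $summary['count']++;
      $summary['totalBytes'] += $size;
      if( $size === 0 ) { $summary['empty'][] = $row[3]; }
      $sizes[$row[3]] = $size; // Keyed by file location.
   }
   fclose($fh);
   arsort($sizes); // Largest first, keeping the location keys.
   $summary['largest'] = array_slice($sizes, 0, $topN, true);
   return $summary;
}
```

Calling summarizeFileList with the path to your CSV file returns the file count, the total size in bytes, the locations of empty files, and the largest files with their sizes.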

The Get File List CSV software was developed on the private server on my Mac which, to my surprise, contained 1,387,170 files (fewer now, after I deleted a bunch). After development, it was tested on 6 of our public domains. All were good, although I did tweak a script that was making more log files than I now needed.

Below is the software source code. Information about one optional customization follows.

<?php
/*
Get File List CSV
Version 1.0
January 15, 2022
Will Bontrager Software LLC
https://www.willmaster.com/
*/

/* Optional customization */
// Specify the CSV output file name.
$outputfilename = 'FileListOutput_'.date('Y-m-d_H-i-s').'.csv';
/* End of optional customization */

mb_regex_encoding('UTF-8');
mb_internal_encoding('UTF-8');
if( ! ini_get('date.timezone') ) { date_default_timezone_set('UTC'); }
set_time_limit(5*60*60);
ini_set('display_errors',1);
error_reporting(E_ALL);
$DocRoot = preg_quote($_SERVER['DOCUMENT_ROOT'],'/');
$OutputFile = fopen($outputfilename,'w');
if( $OutputFile )
{
   fputs( $OutputFile, '"TimeStamp","FileSize","FileName","FileLocation"'."\n" );
   $files = GetDirectoryFiles(__DIR__);
   fclose($OutputFile);
   echo 'OK';
}
else { echo 'Unable to open file to write'; }
exit;
function GetDirectoryFiles($dir)
{
   global $OutputFile, $DocRoot;
   $dir = preg_replace('!/*$!','',$dir);
   $result = array();
   $root = scandir($dir);
   if( $root === false ) { return $result; } // Skip directories that cannot be read.
   foreach($root as $value)
   {
      if($value === '.' || $value === '..') {continue;}
      $flocation = trim("$dir/$value");
      if(is_dir($flocation))
      {
         foreach(GetDirectoryFiles($flocation) as $value) { $result[]=$value; }
      }
      elseif(is_file($flocation))
      {
         $ta=stat("$dir/$value");
         $linechunks = array();
         $linechunks[] = date('Y-m-d H:i:s',$ta[9]);
         $linechunks[] = $ta[7];
         $linechunks[] = str_replace('"','""',$value);
         $linechunks[] = str_replace('"','""',preg_replace('/^'.$DocRoot.'/','',$flocation));
         fputs( $OutputFile, '"' . implode( '","', $linechunks ) . "\"\n" );
      }
   }
   return $result;
}
?>

Optional customization —

The name/location for the CSV file may be changed.

Currently, the file is created in the directory where the software is running and the file name begins with FileListOutput_, then the date and time, and ends with a .csv file extension.

To change the name/location of the CSV file, edit the value assigned to $outputfilename in the optional customization section of the source code. For reference, here is a copy of that line:

$outputfilename = 'FileListOutput_'.date('Y-m-d_H-i-s').'.csv';
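As an illustration, here are two ways that line might be changed. Both are only sketches. The "reports" directory is a made-up example; it would need to exist already and be writable, otherwise the script reports "Unable to open file to write".

```php
<?php
// Option 1: a fixed name in the script's own directory.
// The file is overwritten each time the script runs.
$outputfilename = __DIR__ . '/FileList.csv';

// Option 2: a dated name in a subdirectory.
// "reports" is a hypothetical directory that must already exist.
$outputfilename = __DIR__ . '/reports/FileList_' . date('Y-m-d') . '.csv';
```

Use one assignment or the other. If both are present, the second one wins.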

This PHP script can be installed anywhere in the document area that allows PHP scripts to run.

Where you upload and run the script affects its output.

  • If the script is installed in the document root, the CSV file will list the files in the document area (the document root and all its subdirectories).

  • However, if the script is installed in a subdirectory of the document root, the CSV file will list only the files of that subdirectory and its subdirectories.

That feature allows you to obtain lists of files restricted to certain subdirectories if that is what you prefer.
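Wherever the script is installed, the FileLocation column is written relative to the document root, because the script removes the document root from the front of each full path. The sketch below mirrors that step in isolation. The function name relativeToDocRoot and the example paths are assumptions made for illustration. It also assumes the full path really begins with the document root string, which may not hold when symbolic links are involved.

```php
<?php
// relativeToDocRoot() is a hypothetical helper that mirrors how the
// script strips the document root from the front of a full file path.
function relativeToDocRoot($path, $docRoot)
{
   $quoted = preg_quote($docRoot, '/'); // Escape regex characters and the delimiter.
   return preg_replace('/^' . $quoted . '/', '', $path);
}

// Example with made-up paths:
echo relativeToDocRoot('/home/user/public_html/images/logo.png', '/home/user/public_html'), "\n";
// prints /images/logo.png
```

A path that does not begin with the document root is returned unchanged.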

The PHP Get File List CSV script is fast. Depending on your server and how busy it is at the moment, the script shouldn't take more than a few seconds to get you a list of all files in the document area.

This article first appeared with an issue of the Possibilities newsletter.

Will Bontrager

© 1998-2001 William and Mari Bontrager
© 2001-2011 Bontrager Connection, LLC
© 2011-2024 Will Bontrager Software LLC