DirWalker Class Tutorial

DirWalker is a utility class for recursively walking through directories and either aggregating paths and filenames or processing files as they are found.

DirWalker Class Overview

By default, DirWalker creates an associative array of files by recursively walking through directories and (optionally) their descendants. The array will contain the full path to the file (including the filename) as the key and the full filename of the file as the value. It's important to have the full path as the key in order for the array keys to be unique. By extending DirWalker and overriding its processFile() method, you can process the files as they are found. Directories can be excluded from the search. Files to include and exclude can be specified by strings or regex patterns in the filename. See the files in the core/components/dirwalker/docs/examples directory for examples.
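
To make the shape of that array concrete, here is a minimal sketch that builds a comparable path => filename array with PHP's built-in SPL iterators. It is not DirWalker's implementation, and the function name listFilesSketch() is hypothetical. It only illustrates why the full path makes a good key: two files named index.php in different directories share a value but not a key.

```php
<?php
// Hypothetical helper, NOT part of DirWalker: builds a path => filename
// array similar in shape to the one described above.
function listFilesSketch($dir)
{
    $files = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $fileInfo) {
        if ($fileInfo->isFile()) {
            // Key: full path (unique). Value: bare filename (may repeat).
            $files[$fileInfo->getPathname()] = $fileInfo->getFilename();
        }
    }
    return $files;
}

// Demo: two files with the same name in different temporary directories.
$root = sys_get_temp_dir() . '/dwsketch_' . uniqid();
mkdir($root . '/a', 0777, true);
mkdir($root . '/b', 0777, true);
file_put_contents($root . '/a/index.php', 'a');
file_put_contents($root . '/b/index.php', 'b');

$found = listFilesSketch($root);
foreach ($found as $path => $filename) {
    echo $path . ' => ' . $filename . "\n";
}

// Clean up the temporary files and directories.
unlink($root . '/a/index.php');
unlink($root . '/b/index.php');
rmdir($root . '/a');
rmdir($root . '/b');
rmdir($root);
```

If the filename were used as the key instead, the second index.php would overwrite the first.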

Installing DirWalker

Go to System | Package Management on the main menu in the MODX Manager and click on the "Download Extras" button. That will take you to the Revolution Repository (AKA Web Transport Facility). Put DirWalker in the search box and press Enter. Click on the "Download" button, and once the package is downloaded, click on the "Back to Package Manager" button. That should bring you back to your Package Management grid. Click on the "Install" button next to DirWalker in the grid. The DirWalker package should now be installed.

Usage

DirWalker is simply a class file. When you install it, the class file is placed in the core/components/dirwalker/model/dirwalker directory. No snippets or resources are installed.

In order to use DirWalker in a script, load the class file with the following code at the top of the file:

if (!defined('MODX_CORE_PATH')) {
    define('BASE_PATH', 'C:/xampp/htdocs/addons/');
    require BASE_PATH . 'config.core.php';
    require MODX_CORE_PATH .  'config/' . MODX_CONFIG_KEY . '.inc.php';
}
include MODX_CORE_PATH . 'components/dirwalker/model/dirwalker/dirwalker.class.php';

The code above allows DirWalker to be used inside or outside of MODX, though to use it outside of MODX, you'll need to correct the BASE_PATH setting. Set it to the full path to your MODX root directory and be sure to include a trailing slash. If you will be using DirWalker inside MODX, only the last line of the code above is necessary.

Here is an example that will list all the .php files in your MODX install. It excludes the cache, setup, _build, and packages directories and any Git files or directories as well as combined or minimized files. It will take a while to run.

$searchStart = MODX_BASE_PATH;
$dw = new DirWalker();
$dw->setIncludes('.php');
$dw->setExcludes('.git,-all,-min');
$dw->setExcludeDirs('cache,setup,_build,packages,.git');
$dw->dirWalk($searchStart, true);

$fileArray = $dw->getFiles();

foreach ($fileArray as $path => $filename) {
    echo "\n" . $filename;
}

echo "\n\n" . count($fileArray) . ' php files found';

In the foreach loop, $path will be the full path to the file and $filename will be the full name of the file. Inside the loop, you could process the file, looking inside it with $content = file_get_contents($path) and/or altering its content.
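
As a rough sketch of that kind of processing, the loop below reports which files mention a search string. The $fileArray here is built by hand from two temporary files so the example runs on its own. In real use it would come from $dw->getFiles(). The search string 'getOption' is an arbitrary choice for illustration.

```php
<?php
// Stand-in for the array returned by $dw->getFiles(): full path => filename.
$tmp = sys_get_temp_dir() . '/dwloop_' . uniqid();
mkdir($tmp);
file_put_contents($tmp . '/one.php', '$x = $modx->getOption("site_name");');
file_put_contents($tmp . '/two.php', 'echo "hello";');
$fileArray = array(
    $tmp . '/one.php' => 'one.php',
    $tmp . '/two.php' => 'two.php',
);

$needle = 'getOption';   // arbitrary example search string
$matches = array();

foreach ($fileArray as $path => $filename) {
    $content = file_get_contents($path);
    if ($content === false) {
        continue;        // unreadable file; skip it
    }
    if (strpos($content, $needle) !== false) {
        $matches[] = $filename;
    }
}

echo count($matches) . " file(s) contain '" . $needle . "'\n";

// Clean up the temporary files.
unlink($tmp . '/one.php');
unlink($tmp . '/two.php');
rmdir($tmp);
```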

DirWalker Methods

The following sections describe the various methods of the DirWalker class. The three set*() methods are optional. If you don't call any of them, all files found will be added to the results.

dirWalk($dir, $recursive)

This is the main method. The first argument is the directory to start in (with or without a trailing slash). The second should be either true or false, depending on whether you want dirWalk() to recursively search the directories below the starting directory. The dirWalk() method has a third argument, but you should never send it in the call. It is used internally by the method and should always be empty to begin with.

processFile($dir, $file)

This method is called by the dirWalk() method and should never be called directly. If you need it to do something else, extend the DirWalker class and override it. In the core/components/dirwalker/docs/example/process-files-as-found.php file, there is an example showing how to do that. Overriding the processFile() method is useful if you want to process the files as they are found rather than using the full file list after the search is completed.
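
A minimal sketch of such a subclass follows. The class name LineCounter and its $lineCounts property are hypothetical. So that the sketch can run without MODX, it first declares a tiny stand-in DirWalker whose dirWalk() merely lists one directory. That stand-in is an assumption for illustration only and is skipped when the real class is already loaded. The sketch also assumes that processFile() receives the directory and the bare filename, as the signature above suggests.

```php
<?php
if (!class_exists('DirWalker')) {
    // Stand-in for illustration only; the real class does much more.
    class DirWalker
    {
        protected $files = array();

        public function dirWalk($dir, $recursive = false)
        {
            foreach (scandir($dir) as $file) {
                if (is_file($dir . '/' . $file)) {
                    $this->processFile($dir, $file);
                }
            }
        }

        protected function processFile($dir, $file)
        {
            $this->files[$dir . '/' . $file] = $file;
        }
    }
}

// Hypothetical subclass: counts lines in each file as it is found.
class LineCounter extends DirWalker
{
    public $lineCounts = array();

    // Declared public so the override is valid whether the parent
    // method is public or protected.
    public function processFile($dir, $file)
    {
        // Works whether or not $dir arrives with a trailing slash.
        $path = rtrim($dir, '/\\') . '/' . $file;
        $lines = file($path);
        if ($lines !== false) {
            $this->lineCounts[$path] = count($lines);
        }
    }
}

// Demo on a temporary directory holding one three-line file.
$tmp = sys_get_temp_dir() . '/dwsub_' . uniqid();
mkdir($tmp);
file_put_contents($tmp . '/sample.txt', "one\ntwo\nthree\n");

$lc = new LineCounter();
$lc->dirWalk($tmp, false);
foreach ($lc->lineCounts as $path => $count) {
    echo $path . ': ' . $count . " lines\n";
}

unlink($tmp . '/sample.txt');
rmdir($tmp);
```

Because this override does not add anything to the file list, getFiles() would have nothing useful to return afterward. The results live in $lineCounts instead.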

setIncludes($includes, $useRegex)

This method allows you to set the pattern used to include files in the search. The $includes argument is a single string. It can contain a comma-separated list of strings that must be in the filename or a comma-separated list of regular expression (regex) patterns that must be matched in the filename. If you use the regex feature, set the second argument to true. Otherwise, you can omit it.

If you send multiple strings or regex patterns, a file that matches any of them will be included.
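
The "match any" rule can be pictured with the sketch below. The function matchesAnySketch() is hypothetical and is not DirWalker's code. It assumes that the list is split on commas, that plain strings are matched as substrings of the filename, and that regex patterns are written with their delimiters. Under that assumption, a regex that itself contains a comma (such as a {1,3} quantifier) would be split in two, so it is safest to avoid commas inside patterns.

```php
<?php
// Hypothetical illustration of "a file that matches any entry is included".
function matchesAnySketch($filename, $list, $useRegex = false)
{
    foreach (explode(',', $list) as $pattern) {
        $pattern = trim($pattern);
        if ($pattern === '') {
            continue;    // ignore empty entries such as a trailing comma
        }
        if ($useRegex) {
            // Assumes each pattern includes its own delimiters.
            if (preg_match($pattern, $filename) === 1) {
                return true;
            }
        } elseif (strpos($filename, $pattern) !== false) {
            return true;
        }
    }
    return false;
}

// Plain strings: any substring match is enough.
var_dump(matchesAnySketch('snippet.class.php', '.php,.js'));             // bool(true)
var_dump(matchesAnySketch('style.css', '.php,.js'));                     // bool(false)

// Regex patterns: pass true as the third argument.
var_dump(matchesAnySketch('app-min.js', '/-min\.js$/,/\.map$/', true));  // bool(true)
```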

setExcludes($excluded, $useRegex)

This method lets you exclude files based on a comma-separated list of strings or regex patterns. It works exactly like setIncludes().

If you send multiple strings or patterns, any file that matches any of them will be excluded from the search.

setExcludeDirs($excludes)

This method allows you to exclude directories by sending a comma-separated list of strings in the argument. Any directory that contains any of the strings in its name will be skipped.
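
Because the match is on any part of the directory name, a short string can skip more directories than you intend. The sketch below illustrates this with a hypothetical function, dirIsExcludedSketch(), which is not DirWalker's code and simply assumes a plain substring test against each directory name.

```php
<?php
// Hypothetical sketch of substring matching on directory names.
function dirIsExcludedSketch($dirName, $excludeList)
{
    foreach (explode(',', $excludeList) as $needle) {
        $needle = trim($needle);
        if ($needle !== '' && strpos($dirName, $needle) !== false) {
            return true;
        }
    }
    return false;
}

$excludes = 'cache,setup,_build,packages,.git';
foreach (array('cache', 'mypackages', 'setup-notes', 'model') as $dir) {
    $status = dirIsExcludedSketch($dir, $excludes) ? 'skipped' : 'searched';
    echo $dir . ': ' . $status . "\n";
}
```

Here 'mypackages' and 'setup-notes' are skipped along with 'cache', since each contains one of the strings, while 'model' is searched.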

getFiles()

This method is used after calling dirWalk() to retrieve the array of files found in the search. It returns an array of key/value pairs with the full path to the file (including the filename) as the key and the full name of the file as the value.

resetFiles()

This method sets the $this->files array to an empty array. If you have called dirWalk() and retrieved the files and need to call dirWalk() again on a different directory, call this first to empty the file list. You may call any of the set*() methods again before calling dirWalk() to change the search parameters. There is an example of this in the core/components/dirwalker/docs/example/process-files-as-found.php file.

has*()

The three has*() methods are used internally and should not be called directly, though you are free to override them in a derived class if you want to change their behavior.

Thank you for visiting BobsGuides.com

  —  Bob Ray