Find that Placeholder II

Finding placeholders in included files


In the last article, we looked at a utility snippet that lets you find the code that's setting a particular placeholder. Our snippet had a drawback, though. It didn't look in any files that the snippet or plugin being examined pulled in with include or require. Usually, these would be class files, but not always. We'll fix that problem in this article by adding a function to search those files as well.


As a reminder, here's our snippet tag:

[[!FindPlaceholder? &placeholder=`SomePlaceholder`]]

Checking Included Files

In addition to checking the code of the snippet or plugin being processed, we need to look for our placeholder in any files that are pulled into the snippet or plugin being examined with include, include_once, require, or require_once. The first step is to find the file paths for any included files. We could search for each of those terms individually and parse the line they appear on, but we can do it faster by using a regular expression to get any and all of them at once and extract the file name in the process (more on this in a bit). We may get a few spurious matches, but when we try them, no file will be found, so no harm will be done.

We'll add another function: checkIncludes(). This function uses a regular expression search (preg_match_all()) to find the target strings and extract the filenames. If we find any, we'll check to make sure the file exists, pull in its content, and send that content to our existing checkContent() function.

Notice that in our two calls to checkIncludes(), we pass 'File' as the $type argument, and the function itself sets $type to 'file' before reporting, so the report will say 'file' instead of 'snippet' or 'plugin'.

Here's the modified code:

/* FindPlaceholder snippet */

$ph = $modx->getOption('placeholder', $scriptProperties);

function checkContent($content, $ph, $type, $name, &$output) {
    $found = 0;
    if (strpos($content, $ph) !== false) {
        $found = 1;
        $output .= "\n<p> " . $type . ' ' . $name .
            ' contains placeholder ' . $ph . '</p>';
    }

    return $found;
}

function checkIncludes($content, $ph, $type, $name, &$output) {

    $pattern = <<<EOT
/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i
EOT;

    $matches = array();
    $count = 0;

    $success = preg_match_all($pattern, $content, $matches);

    if (!empty($success)) {

        foreach($matches[1] as $path) {
            /* Skip includes that have a PHP variable in them */
            if (strpos($path, '$') !== false) {
                continue;
            }

            /* Make sure the file exists */
            if (file_exists($path)) {
                /* Make sure it's not a directory */
                if (is_dir($path)) {
                    continue;
                }
                $c = file_get_contents($path);
                $type = 'file';
                if (!empty($c)) {
                    $count += checkContent($c, $ph, $type, $path, $output);
                }
            }
        }
    }

    return $count;
}

$output = "\n\n<br>
<h3>Searching Snippets</h3>";

if (!empty($ph)) {

    $count = 0;
    $type = 'snippet';

    $snippets = $modx->getCollection('modSnippet');

    foreach ($snippets as $snippet) {
        $name = $snippet->get('name');
        $content = $snippet->get('snippet');

        $count += checkContent($content, $ph, $type, $name, $output);
        $count += checkIncludes($content, $ph, 'File', $name, $output);

    }
    if ($count == 0) {
        $output .= '<p>Placeholder not found in snippets</p>';
    }

} else {
    return "\n<p>Error: Placeholder property is empty</p>";
}

$output .= "\n\n<br>
<h3>Searching Plugins</h3>";

$count = 0;
$type = 'plugin';
$plugins = $modx->getCollection('modPlugin');
foreach ($plugins as $plugin) {
    $name = $plugin->get('name');
    $content = $plugin->get('plugincode');

    $count += checkContent($content, $ph, $type, $name, $output);
    $count += checkIncludes($content, $ph, 'File', $name, $output);
}
if ($count == 0) {
    $output .= "\n<p>Placeholder not found in plugins</p>";
}
return $output;

This code is the same as the code in the previous article except that we've added the checkIncludes() function and the two lines that call it (one for snippets and one for plugins). The comments in the code explain what we're doing.
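
To see the guards inside checkIncludes() at work without a MODX install, here is a minimal standalone sketch. The helper name usableIncludePaths() and the sample paths are our own inventions for illustration; the helper applies the same pattern and the same three checks (no PHP variable in the path, file exists, not a directory) and returns the paths that would actually be read.

```php
<?php
/* Hypothetical helper (not part of the snippet): returns only the
   include paths that checkIncludes() would go on to read. */
function usableIncludePaths($content) {
    $pattern = <<<EOT
/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i
EOT;

    $paths = array();
    if (preg_match_all($pattern, $content, $matches)) {
        foreach ($matches[1] as $path) {
            /* Skip includes that have a PHP variable in them */
            if (strpos($path, '$') !== false) {
                continue;
            }
            /* Skip missing files and directories */
            if (!file_exists($path) || is_dir($path)) {
                continue;
            }
            $paths[] = $path;
        }
    }
    return $paths;
}

/* Create a real temporary file so that one include can be resolved */
$tmp = tempnam(sys_get_temp_dir(), 'inc');

$content = "include '" . $tmp . "';\n"       /* exists: kept */
    . "require_once 'no/such/file.php';\n"   /* missing: skipped */
    . "include \"\$dir/helper.php\";\n"      /* has a variable: skipped */
    . "/* do not include 'anything' here */\n"; /* spurious match: skipped */

$found = usableIncludePaths($content);
print_r($found);

unlink($tmp);
```

Assuming no file called anything or no/such/file.php exists in the working directory, only the temporary file's path should survive. The comment line is one of the spurious matches mentioned earlier: the pattern matches it, but no such file is found, so no harm is done.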


The Pattern

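We've used the heredoc syntax for the pattern used by preg_match_all(). The heredoc syntax is very handy when you have a string with a lot of single and double quotation marks and want them all preserved as written without having to escape many of them with a backslash, which makes the string much harder to create and to read. (A heredoc string still interpolates PHP variables and processes escape sequences such as \n. Our pattern contains no variables, and \s is not a PHP escape sequence, so it reaches the regex engine unchanged.)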
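
As a rough illustration of what the heredoc saves us, this sketch builds the same pattern three ways and compares the results. The variable names are ours, chosen only for the comparison.

```php
<?php
/* Heredoc: both kinds of quote appear exactly as typed */
$heredoc = <<<EOT
/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i
EOT;

/* Single-quoted string: every single quote needs a backslash */
$single = '/(?:include|require)(?:_once)*\s+[\'"]([^\'"]+)[\'"]/i';

/* Double-quoted string: every double quote needs a backslash,
   and the backslash in \s is doubled to be explicit */
$double = "/(?:include|require)(?:_once)*\\s+['\"]([^'\"]+)['\"]/i";

var_dump($heredoc === $single);
var_dump($heredoc === $double);
```

All three strings should be identical, so both comparisons should report true. The heredoc version is simply the easiest of the three to read and to get right.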

Here is the pattern:

/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i

The two slashes at the ends are delimiters. They show the regex engine where the pattern begins and ends. The i after the last slash tells the regex engine that we want a case-insensitive search.

The pattern looks for any one of our key terms, include or require ((?:include|require)), followed by zero or more instances of _once ((?:_once)*), followed by one or more whitespace characters (\s+), followed by a quote character (['"]), followed by one or more non-quote characters ([^'"]+), followed by a closing quote character (['"]).

The three sets of parentheses tell the engine what we want to capture (sort of). Actually, the first two sets are prefixed with ?:, which makes them "non-capturing" groups. The engine will look for them when matching the pattern, but won't store them. The third set of parentheses captures the series of characters between the quotes: the actual path to the file being included.
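
To make the effect of ?: concrete, this small sketch runs our pattern and a variant with the ?: prefixes removed against the same line. The variant and the file path are hypothetical, used only for comparison.

```php
<?php
/* Hypothetical line of snippet code to search */
$line = "require_once 'model/example.class.php';";

/* The pattern from the article: keyword groups are non-capturing */
$nonCapturing = <<<EOT
/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i
EOT;

/* The same pattern with the ?: prefixes removed (for comparison only) */
$capturing = <<<EOT
/(include|require)(_once)*\s+['"]([^'"]+)['"]/i
EOT;

preg_match($nonCapturing, $line, $m1);
preg_match($capturing, $line, $m2);

print_r($m1); /* full match, then the path */
print_r($m2); /* full match, 'require', '_once', then the path */
```

With the non-capturing groups, the path is always at index 1. Without them, it moves to index 3, and we would be storing two values we never use.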


Using preg_match_all()

The preg_match_all() function finds all the strings in the text that match the pattern and puts them into a PHP array. The first argument is the pattern, the second is the text string to be searched, and the third is the variable to put the array of matches into. The return value is the number of full matches found: 0 if there are none, or false if an error occurs. Both 0 and false count as empty for PHP's empty() function, which is why our single !empty($success) test covers both cases. You can't tell from our code, but the third argument ($matches in our case) is passed by reference. The preg_match_all() function modifies it, and the changes are available in our code.
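
Here is a short sketch of those return values. The sample strings and the deliberately broken pattern are made up for the demonstration, and the @ operator is there only to silence the warning PHP issues for the invalid pattern.

```php
<?php
$pattern = <<<EOT
/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i
EOT;

$withIncludes = "include 'a.php';\nrequire_once \"b.php\";";
$withoutIncludes = "return 'nothing to see here';";

/* Two matches: the return value is the match count */
$n1 = preg_match_all($pattern, $withIncludes, $matches1);

/* No matches: the return value is 0 */
$n2 = preg_match_all($pattern, $withoutIncludes, $matches2);

/* Invalid pattern (unclosed parenthesis): the return value is false */
$n3 = @preg_match_all('/(unclosed/', $withIncludes, $matches3);

var_dump($n1);
var_dump($n2);
var_dump($n3);

/* $matches1 was filled in by reference */
print_r($matches1[1]);

/* empty() treats 0 and false alike */
var_dump(empty($n2), empty($n3));
```

If you ever need to tell "no matches" from "something went wrong," compare the return value with === false rather than relying on empty().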

Unlike the preg_match() function, preg_match_all() sets the third argument ($matches) to an array of arrays. The first member of the array ($matches[0]) is an array of all the full matches for the pattern. The second one ($matches[1]) is an array containing the matches of the first capture group. Since we only have one capture group (for the file paths), that's all we need. If there were another capture group, we'd have to look at $matches[2] as well.

Here's an example:

$s = "
include \"hello1.php\";
include_once 'hello2.php';
require 'hello3.php';
require_once 'hello4.php';
/* include hello5 */
";

$pattern = <<<EOT
/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i
EOT;

preg_match_all($pattern, $s, $matches);
echo print_r($matches, true);
exit;

The output of our example looks like this:

Array
(
    [0] => Array
        (
            [0] => include "hello1.php"
            [1] => include_once 'hello2.php'
            [2] => require 'hello3.php'
            [3] => require_once 'hello4.php'
        )

    [1] => Array
        (
            [0] => hello1.php
            [1] => hello2.php
            [2] => hello3.php
            [3] => hello4.php
        )

)

The first inner array (which is seldom used, but very handy for debugging the pattern) is the full pattern match for each find. The second one is the captured group for each match (just the file path). Notice that the fifth string (hello5) is not in the results because it doesn't match our pattern. In our checkIncludes() function, we loop through the $matches[1] array, the list of file paths, reading each file that exists and passing its content to our checkContent() function.


Coming Up

Experienced MODX developers almost never hard-code the full path to an included file in their code. Instead, they use the MODX constants, MODX_CORE_PATH, MODX_BASE_PATH, or MODX_ASSETS_PATH as a prefix on their file paths. That way, the paths will still be correct if the site is moved to another directory or another server. Our code will fail to find those files unless we translate the constants into their actual values. We'll do that in the next article.
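
To see the limitation for yourself, this quick sketch runs our current pattern against a line that uses one of the constants. The component path is hypothetical.

```php
<?php
$pattern = <<<EOT
/(?:include|require)(?:_once)*\s+['"]([^'"]+)['"]/i
EOT;

/* Hypothetical snippet code: one constant-prefixed path, one plain path */
$snippetCode =
    "require_once MODX_CORE_PATH . 'components/example/model/example.class.php';\n"
    . "include 'local.php';";

preg_match_all($pattern, $snippetCode, $matches);
print_r($matches[1]);
```

Only local.php should appear in the results. On the first line, the keyword is followed by the constant rather than a quote character, so the pattern never matches it.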




