Understanding .htaccess VII

New .htaccess rules in Apache 2.4.0


In the last article we looked at some .htaccess directives that don't involve rewrite conditions or rules. Some of those are deprecated in Apache 2.4.0. In this one, we'll look at the new directives and how they're used.



Apache 2.4.0+ — The New Rules

The new rules for controlling access are quite different in Apache 2.4.0+, and not particularly well documented. They allow finer control. Sometimes, they use fewer lines of code. Sometimes they require more lines of code. They're difficult to understand at first. The old methods we described in the earlier articles in this series still work at this writing, though they are deprecated. They will probably continue to work for some time, so you may choose to wait until the documentation and examples are clearer before upgrading. You are *strongly* advised not to mix the old and new methods, though.

The following sections express my understanding of the new rules, but take them with a grain of salt, because I may be misunderstanding some of the finer points. Unfortunately, they are very difficult to test. If you want to ban requests from a malicious user's IP address, for example, you won't be able to tell if it's working unless you watch your access logs for visits from the miscreant.

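The rules below by themselves are interpreted to apply to the whole site (assuming that the .htaccess file is in the site root). They can be applied to specific files by enclosing them with the Files and FilesMatch tags we saw in the previous articles in this series. The Directory and DirectoryMatch tags do the same for directories, but Apache only accepts them in the main server or virtual host configuration, not in an .htaccess file. To control a directory from .htaccess, put the rules (without the Directory tags) in an .htaccess file inside that directory.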

In the new style, the general allow and deny rules use the Require directive. It is a directive, not a tag, so it takes no angle brackets. The old allow from all and deny from all now look like this:

Require all granted
Require all denied
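To show how one of these rules combines with the section tags mentioned above, here is a minimal sketch. The file name secret-notes.txt is made up for illustration; the rule refuses web access to any file with that name at or below the directory containing the .htaccess file.

```apache
# Hypothetical example: refuse all web requests for one file.
# "secret-notes.txt" is an assumed name, not from a real site.
<Files "secret-notes.txt">
    Require all denied
</Files>
```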

Here's a simple example that denies requests from servers other than your own. First, the old way.

Old Version:

Order Deny,Allow
Deny from all
Allow from yoursite.com

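This is the new version, which allows only requests that come from the server itself. Note that host here is not the same as %{HTTP_HOST}. It refers to the host name of the site visitor, which Apache determines with a reverse DNS lookup on the visitor's IP address.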

Require host localhost
Require ip 127.0.0.1

The new code above is simpler and easier to understand, but you can make it even shorter. This does the same thing:

Require local

The Require rules may or may not work on a localhost install, especially if you have virtual hosts set up. Require ip 127.0.0.1 and Require local will work, but any Require host hostname directive may fail. So far, I haven't been able to make Require host localhost work on a virtual host in my XAMPP install, and I've tried for several days.
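If you also need to let in machines on your local network, Require ip accepts partial addresses and CIDR ranges as well as single addresses. Here is a minimal sketch, assuming your network uses the private range 192.168.1.0/24 (adjust it to match yours):

```apache
# Requests originating on the server itself
Require local
# Assumed LAN range: any address from 192.168.1.0 to 192.168.1.255
Require ip 192.168.1.0/24
```

When several Require lines appear outside of any Require section, satisfying any one of them is enough to gain access.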


The code below will allow all requests except those from banned IPs:

Old version:

Order allow,deny
Allow from all
Deny from 173.254.224.112
Deny from 95.211.15.80

New version (negated rules must be inside a RequireAll section, which we'll see later in this article):

<RequireAll>
    Require all granted
    Require not ip 173.254.224.112
    Require not ip 95.211.15.80
</RequireAll>

I was confused at first by the use of the word Require in these directives. It's used in several different ways.

In the two lines containing not, it's used in the traditional way. It's actually "requiring" that the IP not be one of the two specified IPs.

In the Require all granted line, though, it's not really "requiring" anything. It's just identifying the line as an access control line. That line "grants" access to "all."

In the RequireNone examples below, the word Require could be seen as having a third meaning, since it actually gains an invisible "not" in front of it by being inside a RequireNone section.
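A ban can also cover a whole address range rather than single addresses. The sketch below assumes that 203.0.113.0/24 (a range reserved for documentation) stands in for the network you want to block. The negated line sits inside a RequireAll section (covered later in this article) together with a line that grants access to everyone else:

```apache
<RequireAll>
    # Let everyone in ...
    Require all granted
    # ... except visitors from the assumed range
    Require not ip 203.0.113.0/24
</RequireAll>
```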


Controlling Access by Method

The Require method directive lets you restrict which methods can be used:

Require method GET POST HEAD

The code above allows only those three methods to be used. If the line is in the .htaccess file in the site root, only those methods will be allowed throughout the site. You could also restrict the method further for a specific file or directory. Say, for example, that you have a form on a page called contact.html that posts its data to another page called response.php with this code:

<form action="https://yoursite.com/response.php" method="post">
</form>

You can make sure that the response.php page can only be accessed by a request where the method is post with this directive:

<Files response.php>
    Require method POST
</Files>

Require Sections

With the new rules, you can also create sections with <RequireAll></RequireAll>, <RequireAny></RequireAny>, and <RequireNone></RequireNone>. These enclose rules like the ones above. Notice that these are tags, unlike the Require directives we saw above. They require a matching closing tag.

RequireAll sections require that all of the enclosed conditions are met.

RequireAny sections require that at least one of the enclosed conditions is met.

RequireNone sections require that none of the enclosed conditions are met.

Important: Negated conditions (containing "not") can *only* be used in RequireAll sections, and the section must also contain at least one condition that isn't negated.

RequireAll — Say you want a part of your site to be available only from 9 a.m. to 5 p.m. This code will do it (we'll discuss the expr later in this article):

<Directory "/customer/chat">
  <RequireAll>
      Require expr %{TIME_HOUR} -ge 9
      Require expr %{TIME_HOUR} -lt 17
  </RequireAll>
</Directory>

The odds are that you'd want to have a custom error page to deliver during the hours when chat is not available. Visitors who are denied access get a 403 (Forbidden) response, so you could add this line just below the top Directory line:

ErrorDocument 403 /chat-not-available.html

RequireAny — Say that only three people are allowed to access the /manager/ directory, and they all visit from dedicated IPs. Something like this will prevent others from accessing that directory:

<Directory "/manager/">
    <RequireAny>
        Require ip 115.23.21.3
        Require ip 192.0.0.1
        Require ip 111.23.23.9
    </RequireAny>
</Directory>
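Since several Require lines outside of any section are treated as if they were inside a RequireAny section, the same restriction can be sketched more compactly. This version assumes the line is placed in an .htaccess file inside the /manager/ directory itself, and it relies on the fact that Require ip accepts more than one address per line:

```apache
# Assumed location: the .htaccess file in the /manager/ directory
Require ip 115.23.21.3 192.0.0.1 111.23.23.9
```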

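RequireNone — Banning specific IPs. A RequireNone section can't grant access by itself, so it goes inside a RequireAll section along with a rule that lets everyone else in:

<RequireAll>
    Require all granted
    <RequireNone>
        Require ip 12.145.12.1
        Require ip 112.12.223.48
        Require ip 124.19.23.9
    </RequireNone>
</RequireAll>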

Nested Requires

Important: The "users" and "groups" listed below are *not* MODX users or groups. They are users and groups established as part of the server's security system. They're here as examples of a complex authorization scheme.

<Directory /personnel/>
    <RequireAll>
        <RequireAny>
            Require user superadmin
            <RequireAll>
                Require group administrators
                <RequireAny>
                    Require group deans
                    Require group presidents
                    Require group vice-presidents
                </RequireAny>
            </RequireAll>
        </RequireAny>
        <RequireNone>
            Require group banned
            Require group staff
        </RequireNone>
    </RequireAll>
</Directory>

The code above shows a nested set of requirements for access. Both of the main sections below the outer RequireAll section must pass. Let's look first at the last one, the RequireNone section. For that section to pass, the user can't be a member of the banned or staff groups. Members of those groups will not be granted access no matter what other groups they belong to because the RequireNone section is inside a RequireAll section.

Assuming that the user is not a member of the banned or staff groups, the RequireNone section will not block them, but they'll have to satisfy the conditions in the other main section, RequireAny, because that's also inside the outer RequireAll section.

That RequireAny section contains two parts:

The first part is the first line, which allows access to the superadmin user. Because it's inside a RequireAny section, the superadmin user passes even if he or she doesn't belong to any of the groups.

The second part is another RequireAll section, which in turn contains a RequireAny section. The user must be a member of the administrators group *and* they must also be a member of any one of the three groups named in the innermost RequireAny section: deans, presidents, and vice-presidents.

If the user is not the superadmin and is also not in the administrators group, they don't get access.

If the user is not the superadmin and *is* in the administrators group, they still don't get access unless they are in one of the three groups named in that inner RequireAny section.

Remember that all of this only applies to the /personnel/ directory. It has no effect on the rest of the site. The directives above could apply to a specific file, a list of files, or files with a particular extension instead of a directory.
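As a sketch of the file-based variant, the rules below protect files by extension instead of by directory. The extensions are invented for the example, and it assumes that authentication (AuthType, AuthUserFile, and so on) and the server's groups are already configured, since Require user and Require group can't identify anyone without them:

```apache
# Hypothetical: only the superadmin user or members of the
# administrators group may fetch PDF and spreadsheet files.
<FilesMatch "\.(pdf|xlsx)$">
    <RequireAny>
        Require user superadmin
        Require group administrators
    </RequireAny>
</FilesMatch>
```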


Using Expressions

Expressions, which are preceded by expr, are a powerful tool for evaluating strings in Require sections. We saw this example earlier in the article for denying access to the /customer/chat directory outside of business hours:

<Directory "/customer/chat">
  <RequireAll>
      Require expr %{TIME_HOUR} -ge 9
      Require expr %{TIME_HOUR} -lt 17
  </RequireAll>
</Directory>

It could be written more simply like this:

<Directory "/customer/chat">
   Require expr %{TIME_HOUR} -ge 9 && %{TIME_HOUR} -lt 17
</Directory>

As you can probably guess, it requires that the hour be greater than or equal to 9 and less than 17 (5 pm). This is *not* a regular expression (though there's a way to use regular expressions). The -ge operator means greater than or equal to and the -lt operator means less than. The && is a logical AND. The two comparison operators perform an integer comparison. If you wanted to compare strings alphabetically, you'd use <= and <.
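Expressions can combine more than two tests. Here is a sketch that limits access to business hours on weekdays only. It assumes the server's clock is set to your local time zone, and it uses the %{TIME_WDAY} variable, which counts the days of the week starting with 0 for Sunday:

```apache
# Monday (1) through Friday (5), from 9:00 up to (not including) 17:00
Require expr "%{TIME_HOUR} -ge 9 && %{TIME_HOUR} -lt 17 && %{TIME_WDAY} -ge 1 && %{TIME_WDAY} -le 5"
```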

Here's a table showing the comparison operators and what they do:


Operator    Description
==          String equality
!=          String inequality
<           String less than
<=          String less than or equal
>           String greater than
>=          String greater than or equal
=~          String matches regular expression
!~          String does not match regular expression
-eq         Integer equality
-ne         Integer inequality
-gt         Integer greater than
-lt         Integer less than
-ge         Integer greater than or equal
-le         Integer less than or equal

You can see the full, technical definition of expressions in the Apache documentation on expressions (ap_expr). It's not as complicated as it looks at first.

You can also use regular expressions (complete with backreferences) in expressions by using the =~ and !~ operators shown in the table above. The server variables, in the format %{VariableName}, are also available. Here's an example that would force the content type to text/plain if the query string contains forceplain.

<If "%{QUERY_STRING} =~ /forceplain/">
    ForceType text/plain
</If>

The example above also uses an <If> section. If the condition on the first line is true, any lines inside the <If></If> section will be applied. You can also use <Else> and <ElseIf> tags. The part after the =~ is a regular expression, so if you wanted to apply the rule if the query string contained either plain or force or forceplain, you could do this:

<If "%{QUERY_STRING} =~ /(force|plain)/">
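Here is a sketch of how the <ElseIf> and <Else> tags fit together with <If>. The forcehtml query string and the X-Forced-Type header are made up for the example, and the Header line assumes that mod_headers is enabled:

```apache
<If "%{QUERY_STRING} =~ /forceplain/">
    ForceType text/plain
</If>
<ElseIf "%{QUERY_STRING} =~ /forcehtml/">
    # Hypothetical second case
    ForceType text/html
</ElseIf>
<Else>
    # Hypothetical marker header, sent when neither test matched
    Header set X-Forced-Type "none"
</Else>
```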

Notice that the expr is not used here. It's only necessary in lines using Require.
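For comparison, here is a sketch of a regular expression test in a Require line, where expr is needed. BadBot is a placeholder for whatever string identifies the client you want to turn away, not the name of a real user agent:

```apache
# Grant access unless the User-Agent header contains "BadBot" (assumed name)
Require expr "%{HTTP_USER_AGENT} !~ /BadBot/"
```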

Here's an example that redirects non-www URLs to the www. version:

<If "%{HTTP_HOST} == 'yoursite.com'">
    Redirect permanent "/" "http://www.yoursite.com/"
</If>

This one does the opposite, redirecting www URLs to the non-www version:

<If "%{HTTP_HOST} == 'www.yoursite.com'">
    Redirect permanent "/" "http://yoursite.com/"
</If>

Redirect directives don't always work in content management systems that route all requests through index.php. In those cases, it's often better to use rewrite conditions and rules (which are still available in Apache 2.4.0+) to do the redirecting.
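Here is a rough sketch of the rewrite-based equivalent of the non-www to www redirect, assuming mod_rewrite is enabled and that yoursite.com is a placeholder for your own domain:

```apache
RewriteEngine On
# Match the bare domain only, ignoring case
RewriteCond %{HTTP_HOST} ^yoursite\.com$ [NC]
# Send a permanent (301) redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
```

Placing these lines above the CMS's own rewrite rules lets the redirect happen before the request is routed to index.php.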


Wrapping Up

We've only scratched the surface of what you can do in an .htaccess file, but I hope I've given you enough information to get started.


Coming Up

In this final article in the "Understanding .htaccess" series, we looked at the new .htaccess rules available in Apache 2.4.0+ and how they work. We also saw a number of examples. In the next article we'll look at some methods of managing paths in your development environment when working on MODX extras.





