Understanding .htaccess VI

Useful .htaccess tricks that don't involve rewrite conditions and rules


In the last article we looked at some practical examples of rewrite rules and how they work. In this one, we'll see some things you can do in .htaccess that don't involve rewrite conditions and rules.



There are lots of things you can do in .htaccess besides rewrite things. We can't cover all of them here, but this article will explain and give examples of some of the common ones.

For examples of useful rewrite rules, see the previous article in this series.


Denying and Allowing

This is one of the most difficult areas of .htaccess to understand. It's also complicated by the fact that the whole system is deprecated in Apache 2.4.0+. We'll look at the old version first. It still works at this writing, but it could stop working in future versions of Apache. In the following article, we'll take a look at some of the features and syntax introduced in Apache 2.4.0.

What confuses people is mainly the order clause, which can take either of these forms:

Order allow,deny
Order deny,allow

The Order line is followed by a number of lines starting with Allow from or Deny from. The Order line determines the order in which they will be evaluated.

If the Order is allow,deny, all the Allow from lines will be evaluated first (regardless of where they appear in the list!). Any request that doesn't match at least one Allow from line is rejected and the process stops. If the request passes the Allow rules by matching at least one of them, the Deny from lines are processed. Any request that matches a Deny from rule is rejected, even though an Allow from rule was matched.

The reverse order, deny,allow, works like this. First, all Deny from rules are processed. If the request matches any of them, it will be rejected *unless* the request matches one of the Allow from rules. A request that matches no rule at all is allowed.
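Here is a minimal sketch of the deny,allow behavior just described. The addresses are hypothetical (they come from a range reserved for documentation), so substitute your own:

```apache
# Hypothetical addresses, for illustration only.
Order deny,allow
# Reject the whole 203.0.113.x block...
Deny from 203.0.113.0/24
# ...except this one address, which is rescued by the Allow rule.
Allow from 203.0.113.7
```

A request from 203.0.113.50 matches the Deny rule and no Allow rule, so it is rejected. A request from 203.0.113.7 matches both, so it gets through. A request from any other address matches neither rule and is allowed.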

Denied requests will get a 403 - Forbidden response by default. If you'd like to send denied requests to another page, you can put something like this above the Order line:

ErrorDocument 403 http://yoursite.com/unauthorized-page.html

Which order you use depends on what you want to do. If you want to allow everyone except a few bad actors, you can do something like this:

Order allow,deny
Allow from all
Deny from 173.254.224.112
Deny from 95.211.15.80

You can also use a single Deny from line with spaces separating the entries:

Deny from 173.254.224.112 95.211.15.80

There is a limit to how long the line can be (lines that are too long get a 500 server error), but it is quite generous — around 6,848 characters. I generally keep the length to the size of my editor's window.

Because the order in which the lines are evaluated depends on the Order line, this code would do exactly the same thing as our example above:

Order allow,deny
Deny from 173.254.224.112
Deny from 95.211.15.80
Allow from all

Because the order is allow,deny, the last line will still be evaluated first.

In a localhost install, you might want to do the opposite and allow no IP but the local one:

Order deny,allow
Deny from all
Allow from 127.0.0.1 localhost

In the code above every request is rejected unless it comes from your local machine.
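On some machines, requests to localhost arrive over IPv6, where the local address is ::1 rather than 127.0.0.1. A variation along these lines should cover both cases. Treat it as a sketch, since whether you need it depends on how your local machine is set up:

```apache
Order deny,allow
Deny from all
# 127.0.0.1 is the IPv4 loopback address; ::1 is its IPv6 counterpart.
Allow from 127.0.0.1 ::1 localhost
```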

When specifying IP addresses in an allow or deny rule, you can leave off the end of the IP address to block all IPs that begin with the number you use:

    Deny from 156.12.12.
    

You can also block a range of addresses using CIDR notation, where the number after the / is the number of leading bits of the address that must match:

    Deny from 156.12.12.0/30
    

The line above would block 156.12.12.0 through 156.12.12.3, but not addresses where the final section was 4 or above.
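As a rough comparison of the forms Apache accepts for a block of addresses, the three lines below all describe the same 256 addresses, 156.12.12.0 through 156.12.12.255. You would use only one of them:

```apache
# Partial address: every IP that begins with 156.12.12.
Deny from 156.12.12.
# CIDR form: the first 24 bits of the address must match.
Deny from 156.12.12.0/24
# Network/netmask pair: the same block again.
Deny from 156.12.12.0/255.255.255.0
```

The smaller the number after the slash, the bigger the block, so it pays to double-check a CIDR rule before uploading it.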

If you want to block or allow requests from specific countries, you can build the rules from published per-country IP lists.


Controlling Access to Specific Files

You can deny or allow access to specific files by using <Files> or <FilesMatch> sections in .htaccess.

For example, to protect a particular file called data.php, you can do this:

<Files data.php>
    Deny from All
</Files>

The following code will deny access to .htaccess itself. Notice that we don't have to escape the dot, since this is not a regular expression.

<Files .htaccess>
    Deny from all
</Files>

Important: Remember that the string being matched by <Files> and <FilesMatch> is the filename itself. The rest of the URL is not available to match.

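The part between the tags is a regular Allow/Deny section, so you can, for example, allow access only to requests from your own domain. Note that a domain name in an Allow from line is compared with the hostname of the requesting machine (found by a reverse DNS lookup of its IP address), not with the page that linked to the file: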

<Files data.php>
    Order deny,allow
    Deny from All
    Allow from yourdomain.com
</Files>

You can use wildcards in your <Files> tag, but it is not a regular expression. In the tag, the ? character matches any single character. The * character matches any string of characters. This line would match all .php files:

<Files *.php>

You can make your match use a true regular expression by adding ~ followed by a space in front of the pattern, but it is not recommended. Instead, you could use <FilesMatch>, which is easier to read and has more predictable behavior.

To match multiple files, people usually use <FilesMatch>. With that tag, the second part of the tag is always a regular expression used to match the desired filenames. The following code would deny access to all files that begin with a dot. We have to escape the dot with a backslash, otherwise it will match any character and we'll be denying access to all files on the site.

<FilesMatch ^\.>
    Deny from all
</FilesMatch>
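The same approach works for matching by extension. The sketch below hides a few kinds of files that often end up in a web directory by accident. The list of extensions is just an assumption for illustration, so adjust it to your own site:

```apache
# Hypothetical list of extensions to protect.
# The dot is escaped so it matches a literal dot, and $ anchors the match to the end of the filename.
<FilesMatch "\.(bak|sql|log|ini)$">
    Deny from all
</FilesMatch>
```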

The following code would deny requests for all .jpg or .jpeg files regardless of case except those coming from your own site:

    <FilesMatch "(?i)^.*\.jpe?g$">
        Order deny,allow
        Deny from all
        Allow from yoursite.com
    </FilesMatch>

Since the final part of the tag is a regular expression, the ., *, and ? symbols now take on their usual meanings in a regular expression. The . (if not escaped) matches any character, the * matches zero or more of the preceding element, and the ? makes the preceding element optional.


Controlling Access to Specific Directories

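Controlling access to directories works much like using <Files> and <FilesMatch> except that we use <Directory> and <DirectoryMatch>. All the rules we described in the previous section apply, except that the full path is evaluated instead of just the filename. One difference matters a great deal: Apache only accepts <Directory> and <DirectoryMatch> in the server configuration file or a virtual host, not in .htaccess. In .htaccess itself, you control a directory by placing the directives in an .htaccess file inside that directory.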

Important: The four file and directory tags discussed here can only be used with actual physical files and directories. If you're using a Content Management System that stores pages in the database and creates them on the fly based on requests, these four directives won't work because there's no physical file, and often no physical directory. They can still be used, though, for any actual files (e.g., images, CSS files, JavaScript files, downloadable files, etc.), and for any directories that actually exist on the server, whether or not the requested file exists.

The following code will deny access to any directories where the directory name is three decimal digits:

<DirectoryMatch "^(.+/)?[0-9]{3}/">
    Deny from all
</DirectoryMatch>

Important: If there is a space in a directory or filename, you must either express it with \s (in a regular expression) or enclose the entire search string in double quotes. Otherwise, Apache will interpret the space as a delimiter separating two parts of the condition, rule, or directive.
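For example, here is a sketch of both ways to handle a space, using a made-up filename (site backup.zip) that you would replace with your own:

```apache
# Wildcard form: the double quotes keep the space as part of the filename.
<Files "site backup.zip">
    Deny from all
</Files>

# Regular expression form: \s stands for the space and the dot is escaped.
<FilesMatch "^site\sbackup\.zip$">
    Deny from all
</FilesMatch>
```

Either section on its own would be enough. They are shown together only for comparison.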


AllowOverride

The AllowOverride directive determines what can be overridden in .htaccess files below the current directory. If it appears in the server configuration file (httpd.conf) it affects the whole site. For example, this line in the server configuration file will tell Apache to ignore all .htaccess files at the site.

AllowOverride none

If you are on a shared host that has this directive in the server configuration file and won't remove it, none of the material in this series of articles will work. Changing hosts would be necessary if you want to implement any of the stuff we've discussed in this series.

On the other hand, this line will allow overriding anything:

AllowOverride All

The AllowOverride directive is only allowed in <Directory> sections that don't contain regular expressions. It is not allowed in <Files>, <FilesMatch>, <Location>, or <DirectoryMatch> sections.
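Between None and All, a host can allow only certain groups of directives. The fragment below is a sketch of what that might look like in the server configuration file. The path /var/www/example is a made-up placeholder, and which groups a real host enables is entirely up to the host:

```apache
# Goes in httpd.conf or a virtual host, not in .htaccess.
# /var/www/example is a hypothetical path.
<Directory "/var/www/example">
    # Limit covers Order, Allow, and Deny.
    # Options covers the Options directive.
    # FileInfo covers directives such as ErrorDocument, AddType, and the rewrite directives.
    AllowOverride Limit Options FileInfo
</Directory>
```

If a directive in .htaccess belongs to a group that is not allowed, Apache will typically respond with a 500 Internal Server Error rather than quietly skipping it.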


Options

In order to use the Options directive, you must have AllowOverride Options set (or AllowOverride All) for the current directory. Because AllowOverride is only valid in <Directory> sections, that setting has to come from the server config file or a virtual host specification, in a <Directory> section that covers the current directory or one above it. It cannot be set in an .htaccess file.

The most commonly used option in .htaccess controls how Apache responds to a request for a directory when no index file exists in it and no rewrite rule applies. On many servers, the default action is to display a listing of the files and directories in the requested directory. To prevent that and display no index listing, you can do this in .htaccess:

Options -Indexes

To turn indexing on for a directory, you can do the opposite:

Options +Indexes

The line above would only be necessary if -Indexes were set at a higher level. Again, AllowOverride Options or AllowOverride All would have to be in effect for the directive to be applied.
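If you do turn listings on, you may not want every file to appear in them. One way to handle that is the IndexIgnore directive from mod_autoindex, sketched below. This assumes the host permits it (it falls under AllowOverride Indexes), and the filename patterns are only examples:

```apache
Options +Indexes
# Leave these files out of the generated listing.
# This only hides them from the listing; it does not block direct requests for them.
IndexIgnore *.bak *.log .htaccess
```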

With the first version (Options -Indexes), you can specify what files will be selected in place of the bare index display:

Options -Indexes
DirectoryIndex index.php index.html index.htm home.htm home.html /index.php

If the request is for a directory and no rewrite rule applies, Apache will search that directory for the filenames listed in the DirectoryIndex line, in the order they occur in the line, until it finds one. A common default order for shared servers is index.html, index.php. That means that if you have an index.html file and an index.php file, the first one will be displayed and the index.php file will never execute unless it's specifically named in the request.

Notice that the final entry on the DirectoryIndex line above will send the user to the index.php file in the root directory of the site (if it exists) when none of the other files are found in the current directory.


Forcing Downloads

If you want a particular file type to be offered for download instead of displayed or executed, you can use this directive:

AddType application/octet-stream .avi .mpg .mov .pdf .xls .mp4

Here's another method, which will let you force individual files to be offered for download:

<FilesMatch "\.pdf$">
    ForceType application/octet-stream
    Header set Content-Disposition attachment
</FilesMatch>

<FilesMatch "^package\.php$">
    ForceType application/octet-stream
    Header set Content-Disposition attachment
</FilesMatch>

Unfortunately, the methods above are really suggestions to the browser. They may be ignored depending on the browser and how it is configured. As a last resort, you can always place the file in a compressed archive (e.g., package.zip). Compressed files will almost always be offered for download.
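One more caution: the Header directive belongs to the mod_headers module, and if that module isn't loaded, the line will produce a 500 error. A defensive version, sketched here for the .pdf case, wraps the section in an <IfModule> test so that it is simply skipped when mod_headers is missing:

```apache
# Only applied when mod_headers is loaded; otherwise Apache skips the whole block.
<IfModule mod_headers.c>
    <FilesMatch "\.pdf$">
        ForceType application/octet-stream
        Header set Content-Disposition attachment
    </FilesMatch>
</IfModule>
```

The trade-off is that a skipped block fails silently, so the files will be displayed as usual and you won't see an error telling you why.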


Custom Error Document

If you would like to display a custom HTML file for error responses, you can do this in .htaccess:

ErrorDocument 404 /page-not-found.html
ErrorDocument 403 /forbidden.html

The code above assumes that the two files listed in the third part of each line exist in the site root. You can create a custom error page for any standard HTTP error code, but the two above are the most common. Many CMS (Content Management System) platforms handle some errors on their own. They almost always handle 404 - Not Found errors. In that case, you would create your own error pages inside the CMS rather than using .htaccess.

Custom error documents can cause trouble in some Content Management Systems (CMSs) and other sites that depend on the fact that the physical file requested does not exist. In that case, the .htaccess file contains -f and/or -d flags to trigger actions when a file or directory is not found so that the CMS can generate the page itself from the database. If there is a physical, custom 404 page, though, those actions may never occur. Usually, the file-not-found condition is handled internally by the CMS, which may have its own custom error page.
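The third part of an ErrorDocument line doesn't have to be a file. As a sketch of the other forms (the message text and the /server-error.html file are made up for illustration):

```apache
# A short literal message instead of a page; the quotes mark it as text.
ErrorDocument 403 "Sorry, you are not allowed to view this page."

# A local path is served internally, so the visitor still receives the 500 status code.
ErrorDocument 500 /server-error.html
```

With a full URL (one beginning with http://), Apache redirects the visitor to that address instead, so the browser never sees the original error status. A local path is usually the safer choice.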


Set Server Timezone

You can use the SetEnv directive to set the timezone for your server:

SetEnv TZ America/Chicago

See the IANA time zone database for the list of valid TZ values.


Set Server Admin

This directive also uses SetEnv. It registers the Server Admin's email address, which is sometimes displayed in server-generated error pages:

SetEnv SERVER_ADMIN youremail@gmail.com

Set PHP Variables and Flags

You can set PHP variable values and flags in .htaccess, but it won't always work:

php_flag display_errors Off
php_flag register_globals Off
php_value post_max_size 20M
php_value upload_max_filesize 15M
php_value memory_limit 128M
php_value max_execution_time 300

These directives are only understood when PHP runs as an Apache module. Many hosts set things up so that any of the above lines will lead to a crash with a 500 Internal Server Error response. On most of those hosts, though, you can use a php.ini file to do the same things. Often the host will configure things so that only some of the directives will be honored, but usually the others will be ignored instead of throwing an error.

In php.ini, you leave off the first part and insert an equals sign:

display_errors = Off
register_globals = Off
post_max_size = 20M
upload_max_filesize = 15M
memory_limit = 128M
max_execution_time = 300
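Going back to the .htaccess form for a moment: if you aren't sure whether PHP runs as an Apache module on your host, you can guard the lines with an <IfModule> test so they are skipped rather than crashing the site. This is only a sketch. The module identifier is an assumption that depends on the PHP version (mod_php5.c for PHP 5, mod_php7.c for PHP 7, mod_php.c for PHP 8), so check which one your server uses:

```apache
# Assumes PHP 7 running as an Apache module; change the identifier to match your server.
<IfModule mod_php7.c>
    php_flag display_errors Off
    php_value memory_limit 128M
    php_value max_execution_time 300
</IfModule>
```

When the module isn't present, the block is ignored and the settings have to go in php.ini instead.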

Default Character Set

This lets you set the default character set, which will be used by the server if the character set is not specified elsewhere:

AddDefaultCharset UTF-8

Coming Up

In this article we looked at some things you can do in .htaccess that don't involve rewrite conditions or rules. In the next article, we'll see some of the new options that arrived in Apache 2.4.0 to replace some of the directives above, which are now deprecated.




