Protecting your Drupal files (including robots.txt in cgi-bin)

(via Marco)

> > So this means that *all* .txt files in cgi-bin are readable (e.g.,
> > CHANGELOG.txt, UPGRADE.txt, etc) ?

Yes, that's correct.

> > If so, how would one go about restricting access to those types of
> > files in the AFS environment? In *nix, one would just set the
> > permissions of the individual files, but since AFS ACLs act on the
> > enclosing directory, how do you "pick and choose"?

What you are probably looking for here is a combination of AFS ACLs
and Webauth permissions. You'd want to protect the entire directory
with AFS ACLs (it probably already is) and then pick and choose which
files to Webauth so they can't be accessed through the web.

Here's how.

First, your cgi-bin directory and its subdirectories should have AFS
ACLs that allow just the admins, the servers, and the backup daemon to
do their jobs. The permissions are set that way automatically when ITS
first creates the cgi-bin directory, so unless you have modified them,
you should be OK. Mine look like this (if you are working on a dept or
group site, you will also have an additional -admins group listed):

system:backup rl
system:www-servers rl
system:administrators rlidwka
mrmarco rlidwka
mrmarco.cgi rlidwk

That will protect your files from people trying to access them through
AFS, but of course, they are still visible on the web.
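If you want to verify (or restore) those ACLs yourself, the standard AFS `fs` commands will do it; a quick sketch, run from the parent of your cgi-bin directory (the `system:www-servers` entry is just an example of one you might need to re-add):

```
# Show the current ACL on the cgi-bin directory
fs listacl cgi-bin

# Grant the web servers read and lookup rights, if missing
fs setacl cgi-bin system:www-servers rl
```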

Webauth to the rescue. If all the files you want to hide are in the
same directory, use Webauth in .htaccess to protect the whole
directory. This works well for the /files subdirectory, or wherever
your uploads go.
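For example, a minimal .htaccess dropped into the directory you want to hide might look like this (a sketch; `mrmarco` is just my username, so substitute your own or a privgroup):

```apache
# Require a WebAuth login for everything in this directory
AuthType WebAuth
require user mrmarco
```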

If the files you want to Webauth are interspersed among files you
don't want to webauth, wrap the Webauth directives within a "Files" or
"FilesMatch" block.

For example, to protect all files called "error_log.txt":

<Files "error_log.txt">
AuthType WebAuth
require user mrmarco
</Files>

You can have more than one <Files> block in your .htaccess, but it
gets tedious after a while. So if all the files you want to protect
have something in common, you can use a regular expression with a
<FilesMatch> block. For example, to protect all .txt files, you'd do
this:

<FilesMatch "\.txt$">
AuthType WebAuth
require user mrmarco
</FilesMatch>

There's a catch though. The files are matched not just in the
directory where the .htaccess file is, but also in its subdirectories,
and in Drupal, that would mean the /files directory as well.
So, be careful.

Another thing you might have noticed with the above is that if you
Webauth all .txt files, you'll be Webauth'ing the robots.txt file as
well. Not good: search-engine crawlers need to fetch robots.txt
anonymously.

There is a way to de-webauth something; I just found out about it
today and haven't fully tested it:

You can use the "Allow" and "Satisfy" directives together to first put
up a more permissive rule, and then tell Apache that any rule can be
satisfied for access to a particular file. In this case:

<Files "robots.txt">
Allow from all
Satisfy any
</Files>

Now, all files ending in .txt are webauth'ed, but robots.txt also can
be accessed from all hosts. And, only one rule needs to be satisfied.
In this case, the "allow from all" rule is satisfied and robots.txt
can be accessed without going through webauth. Works for directories,
too, in case you want to de-webauth a subdirectory when the parent
directory is webauth'ed.
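Putting the two pieces together, the relevant part of the top-level .htaccess would look something like this (again, only lightly tested, and `mrmarco` is a placeholder):

```apache
# Require WebAuth for every .txt file in this tree...
<FilesMatch "\.txt$">
AuthType WebAuth
require user mrmarco
</FilesMatch>

# ...but let anyone fetch robots.txt
<Files "robots.txt">
Allow from all
Satisfy any
</Files>
```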

To be clear though, I don't think all this is needed for Drupal
because I don't think the .txt files are sensitive. To protect a
Drupal installation I would do the following (other advice gladly
accepted):

  1. Verify the AFS ACLs for the cgi-bin directory
  2. Turn on "private" downloading of files from within Drupal's administration (you may have to do this anyway for the downloads to work)
  3. Webauth the "files" directory so it can't be accessed directly

The contents of the settings.php file should not be visible on the
web, since the server executes PHP files rather than serving their
source.