Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

mod rewrite - How to block multiple mod_rewrite passes (or infinite loops) in a .htaccess context

I'm working on a website running on a shared Apache v2.2 server, so all configuration is via .htaccess files, and I wanted to use mod_rewrite to map URLs to the filesystem in a less-than-completely-straightforward way. Just for example's sake, let's say that what I wanted to do was this:

  • Map URL www.mysite.com/Alice to filesystem folder /public_html/Bob
  • Map URL www.mysite.com/Bob to filesystem folder /public_html/Alice

Now, after several hours' work carefully designing the ruleset (the real one, not the Alice/Bob one!) I put all my carefully crafted rewriting rules in a .htaccess file in /public_html, and tested it out... only to get a 500 server error!

I'd been caught out by a well documented "gotcha!" in Apache: When mod_rewrite rules are used inside a .htaccess file, a re-written URL is re-submitted for another round of processing (as if it were an external request). That happens so that any rules in the target folder of the re-written request can be applied, but it can result in some very counter-intuitive behaviour by the webserver!

In the above example, that means that a request for www.mysite.com/Alice/foo.html gets rewritten to /Bob/foo.html, and then resubmitted (internally) to the server as a request for www.mysite.com/Bob/foo.html. This is then re-rewritten back to /Alice/foo.html and resubmitted, which causes it to get re-re-rewritten to /Bob/foo.html, and so on; an infinite loop ensues... broken only when Apache hits its limit on internal redirects (LimitInternalRecursion, 10 by default) and gives up with a 500 error.
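
For reference, here is a minimal sketch of the kind of naive ruleset that loops this way. It assumes the .htaccess file sits in /public_html and that mod_rewrite is enabled; the comments simply trace the passes described above.

```apache
RewriteEngine On

# Pass 1: Alice/foo.html -> Bob/foo.html, then resubmitted internally
# Pass 2: Bob/foo.html   -> Alice/foo.html, then resubmitted internally
# Pass 3: Alice/foo.html -> Bob/foo.html ... and so on
RewriteRule ^Alice(/.*)?$ Bob$1 [L]
RewriteRule ^Bob(/.*)?$ Alice$1 [L]
```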


The question is, how to ensure that a .htaccess mod_rewrite ruleset only gets applied ONCE?


The [L] flag in a RewriteRule stops all further rewriting during a single pass through the ruleset, but doesn't stop the entire ruleset from being re-applied after the re-written URL is resubmitted to the server. According to the documentation, Apache v2.3.9+ (currently in Beta) contains an [END] flag that provides precisely this functionality. Unfortunately, the web host is still using Apache 2.2, and they declined my polite request to upgrade to the beta version!
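
On a server that does run Apache 2.3.9 or later, the Alice/Bob mapping would presumably reduce to something like the sketch below. It is untested here, since the host in question is on 2.2, and it assumes the same .htaccess location in /public_html.

```apache
RewriteEngine On

# [END] stops this pass and also prevents any further rewrite passes
# in per-directory (.htaccess) context. Requires Apache 2.3.9+.
RewriteRule ^Alice(/.*)?$ Bob$1 [END]
RewriteRule ^Bob(/.*)?$ Alice$1 [END]
```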

What's needed is a workaround that provides similar functionality to the [END] flag. My first thought was that I could use an environment variable: Set a flag during the first rewriting pass that would tell subsequent passes to do no further rewriting. If I called my flag variable 'END', the code might look like this:

#  Prevent further rewriting if 'END' is flagged
RewriteCond %{ENV:END} =1
RewriteRule .* - [L]

#  Map /Alice to /Bob, and /Bob to /Alice, and flag 'END' when done
RewriteRule ^Alice(/.*)?$ Bob$1 [L,E=END:1]
RewriteRule ^Bob(/.*)?$ Alice$1 [L,E=END:1]

Unfortunately this code doesn't work: After a bit of experimentation, I discovered that environment variables don't survive the process of re-submitting the rewritten URL to the server. The last line on this Apache documentation page suggests that environment variables ought to survive internal redirects, but I found that not to be the case.

[EDIT: On some servers, it does work. If so, it's a better solution than what follows below. You'll have to try it for yourself on your own server to see.]
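
One possible explanation for the mixed results, offered as an assumption rather than something verified in this post: Apache renames environment variables across an internal redirect by prefixing them with REDIRECT_, so a variable set as END on the first pass would show up as REDIRECT_END on the second. A sketch that tests the prefixed name instead (try it on your own server before relying on it):

```apache
RewriteEngine On

#  Prevent further rewriting if an earlier pass flagged 'END'.
#  Assumption: after the internal redirect, END is visible as REDIRECT_END.
RewriteCond %{ENV:REDIRECT_END} =1
RewriteRule .* - [L]

#  Map /Alice to /Bob, and /Bob to /Alice, and flag 'END' when done
RewriteRule ^Alice(/.*)?$ Bob$1 [L,E=END:1]
RewriteRule ^Bob(/.*)?$ Alice$1 [L,E=END:1]
```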

Still, the general idea can be salvaged. After many hours of hair-pulling, and some advice from a colleague, I realised that HTTP request headers are preserved across internal redirects, so if I could store my flag in one of those, it might work!


Here's my solution:


# This header flags that there's no more rewriting to be done.
# It's a kludge until use of the END flag becomes possible in Apache v2.3.9+
# ######## REMOVE this directive for Apache 2.3.9+, and change all [...,L,E=END:1]
# ######## to just [...,END] in all the rules below!

RequestHeader set SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj 1 env=END


# If our special end-of-rewriting header is set this rule blocks all further rewrites.
# ######## REMOVE this directive for Apache 2.3.9+, and change all [...,L,E=END:1]
# ######## to just [...,END] in all the rules below!

RewriteCond %{HTTP:SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj} =1 [NV]
RewriteRule .* - [L]


#  Map /Alice to /Bob, and /Bob to /Alice, and flag 'END' when done

RewriteRule ^Alice(/.*)?$ Bob$1 [L,E=END:1]
RewriteRule ^Bob(/.*)?$ Alice$1 [L,E=END:1]

...and, it worked! Here's why: Inside a .htaccess file, directives associated with various Apache modules execute in the module order defined in the main Apache configuration (or, that's my understanding, anyway...). In this case (and critically for the success of this solution) mod_headers was set to execute after mod_rewrite, so the RequestHeader directive gets executed after the rewrite rules. That means the SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj header gets added to the HTTP request iff a RewriteRule with [E=END:1] in its flag list gets matched. On the next pass (after the re-written request is resubmitted to the server) the first RewriteRule detects this header, and aborts any further rewriting.

Some things to note about this solution are:

  1. It won't work if Apache is configured to run mod_headers before mod_rewrite. (I'm not sure if that's even possible, or if so, how unusual it'd be).

  2. If an external user includes a SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj header in their HTTP request to the server, it'll disable all URL rewriting rules, and that user will see the filesystem directory structure "as-is". That's the reason for the random string of ASCII characters at the end of the header name: it's to make the header hard to guess. Whether this is a feature or a security vulnerability depends on your point of view!

  3. The idea here was a workaround to mimic the use of the [END] flag in Apache versions that don't yet have it. If all you wanted was to ensure your ruleset only runs once, regardless of which rules are triggered, then you could probably drop the use of the 'END' environment variable and just do this:

    RewriteCond %{HTTP:SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj} =1 [NV]
    RewriteRule .* - [L]
    
    RequestHeader set SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj 1
    
    #  Map /Alice to /Bob, and /Bob to /Alice
    RewriteRule ^Alice(/.*)?$ Bob$1 [L]
    RewriteRule ^Bob(/.*)?$ Alice$1 [L]
    

    Or even better, this (though the REDIRECT_* variables are poorly documented in the Apache v2.2 documentation, and seem to be mentioned only here, so I can't guarantee it'd work on all versions of Apache):

    RewriteCond %{ENV:REDIRECT_STATUS} !^$
    RewriteRule .* - [L]
    
    #  Map /Alice to /Bob, and /Bob to /Alice
    RewriteRule ^Alice(/.*)?$ Bob$1 [L]
    RewriteRule ^Bob(/.*)?$ Alice$1 [L]
    

    However, once you're running Apache v2.3.9+, I expect that using the [END] flag would be more efficient than the above solution, because (presumably) it altogether avoids the rewritten URL being re-submitted to the server for another rewriting pass.

    Note that you may also want to block rewriting of subrequests, in which case you can add a RewriteCond to the don't-do-any-more-rewriting rule, like this:

    RewriteCond %{ENV:REDIRECT_STATUS} !^$ [OR]
    RewriteCond %{IS_SUBREQ} =true
    RewriteRule .* - [L]
    
  4. The idea here was a workaround to replace the use of the [END] flag in Apache versions that don't yet have it. But in fact you can use this general approach to store more than just a single flag: you could store arbitrary strings or numbers that would persist across an internal server redirect, and design your rewrite rules to depend on them based on any of the test conditions RewriteCond provides. (I can't, off the top of my head, think of a reason why you'd want to do that... but hey, the more flexibility and control you have, the better, right?)
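
As a rough sketch of point 4, a value (rather than a bare flag) could be carried across the internal redirect like this. The names ORIGIN (environment variable) and X-Rewrite-Origin (header) are hypothetical, made up for illustration, and the same caveats as above apply: it depends on mod_headers running after mod_rewrite, and an external client could send the header itself.

```apache
# Copy the ORIGIN variable (set by the rules below) into a request header.
# %{ORIGIN}e is mod_headers' format specifier for an environment variable.
# ORIGIN and X-Rewrite-Origin are hypothetical names.
RequestHeader set X-Rewrite-Origin "%{ORIGIN}e" env=ORIGIN

# On a later pass: stop if an earlier pass already recorded an origin.
# A rule could instead test for a specific value, e.g.
#   RewriteCond %{HTTP:X-Rewrite-Origin} =Alice
RewriteCond %{HTTP:X-Rewrite-Origin} !^$ [NV]
RewriteRule .* - [L]

#  Map /Alice to /Bob, and /Bob to /Alice, recording where we came from
RewriteRule ^Alice(/.*)?$ Bob$1 [L,E=ORIGIN:Alice]
RewriteRule ^Bob(/.*)?$ Alice$1 [L,E=ORIGIN:Bob]
```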


I guess anyone who's read this far has figured out that I'm not really asking a question here. It's more a matter of my having found my own solution to a problem I had, and wanting to post it up here for reference in case anyone else has run into the same problem. That's a big part of what this website is for, right?

...

But since this is supposed to be a question-and-answer forum, I'll ask:

  • Can anyone see any potential problems with this solution (other than those I've already mentioned)?
  • Or does anyone have a better way of achieving the same thing?

1 Reply


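Depending on your Apache build, this condition may work. As written it belongs on a specific problematic rule, so that the rule only fires on the first pass; to use it on the "stop-rewriting" rule (i.e. RewriteRule .* - [L]) negate the pattern to !^$ instead: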

RewriteCond %{ENV:REDIRECT_STATUS} ^$

REDIRECT_STATUS will be empty on the very first / initial rewrite and will have a value of 200 (or maybe other values as well -- I have not checked that deeply) on any subsequent cycle.

Unfortunately it works on some systems and not on others, and I personally have no idea what is responsible for making it work.
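
Attached to the individual rules from the question, that condition might look like the sketch below. Note that a RewriteCond only applies to the RewriteRule immediately following it, so it has to be repeated for each rule. As said above, whether REDIRECT_STATUS behaves this way depends on the server.

```apache
RewriteEngine On

# Only rewrite on the initial pass, when REDIRECT_STATUS is still empty
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^Alice(/.*)?$ Bob$1 [L]

RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^Bob(/.*)?$ Alice$1 [L]
```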

Other than this, the most common thing is to add a rewrite condition that checks the original URL, for example by parsing the %{THE_REQUEST} variable, e.g. RewriteCond %{THE_REQUEST} ^[A-Z]+\s.+\.php\sHTTP/.+ -- but this only makes sense for individual problematic rules.
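
Applied to the Alice/Bob example, a THE_REQUEST check could be sketched as follows. It relies on THE_REQUEST holding the original request line (e.g. "GET /Alice/foo.html HTTP/1.1"), which does not change across internal rewrites. It assumes the site is served from the document root, so the requested path starts with /Alice or /Bob, and it does not account for URL-encoded variants of those names.

```apache
RewriteEngine On

# Rewrite Alice -> Bob only if the client originally asked for /Alice...
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/Alice[/?\s]
RewriteRule ^Alice(/.*)?$ Bob$1 [L]

# Rewrite Bob -> Alice only if the client originally asked for /Bob...
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/Bob[/?\s]
RewriteRule ^Bob(/.*)?$ Alice$1 [L]
```

On the second pass the URL is Bob/foo.html but the original request line still names /Alice, so neither rule matches and the rewriting stops.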

In general -- you should avoid such "rewrite A -> B and then B -> A" situations (I'm pretty sure you are aware of that).

As for your own solution -- "don't fix it if it ain't broken" -- if it works then it's great, as I do not see any major problems with such an approach.

