Create your own free reverse proxy with Azure Web Apps

Tom Chantler

Summary

This article explains how to use Azure Web Apps (the new name for Azure Websites) to create a free reverse proxy such that all requests to tomssl-proxy.azurewebsites.net actually serve content from tomssl.com, without this being apparent to the end user. We will also force the connection to be made securely over SSL (using the azurewebsites.net SSL certificate, not a certificate from tomssl.com, which means we can do this even if the existing live website doesn't support SSL). Then, just for fun, we will edit some of the content that's returned and finally we'll add a red warning banner, which will give the game away (but you can be confident that somebody doing this maliciously wouldn't do that).

NOTE: It's possible to do all of these things using IIS URL rewriting and Application Request Routing (ARR) in a standard installation of Internet Information Services (IIS) and indeed that's what Azure Web Apps uses under the hood. However, you'd need a server for that, whereas not only do we not need a server, we don't even need to spend any money as we can use a free Microsoft Azure Web App. You can sign up for a free trial and try it yourself.

Let's see exactly what we're going to do:

  1. Make all requests to tomssl-proxy.azurewebsites.net secretly retrieve the content from tomssl.com, without us knowing (the address bar won't change).
  2. Rewrite all the references (including hyperlinks) to tomssl.com so they actually point to tomssl-proxy.azurewebsites.net, thus aiding the deception we're practising on ourselves.
  3. Blank out some of the text to make it look like some government agency has been censoring our internet connection.
  4. Make words that end in ing end in in'.
  5. Alter some dates (changing 2015 to 2014).
  6. Replace the pictures of me with somebody else.
  7. Add a floating header to the page explaining what's going on and featuring links to this blog post and to the original page, effectively undermining the subterfuge from (1) and (2).

I expect you'll agree that if we can do this then surely hilarity will ensue. But on a serious note, it's fairly easy to see how this could be useful in a more formal context.

  • You could access an internal intranet site over the internet without exposing the webserver to the internet (e.g. by using an Azure Web Apps hybrid connection).
  • You could improve the security of an insecure live website by forcing communication over HTTPS without having access to the source code of the original website (or anything other than the live URL).
  • Say you have separate websites for your blog (blog.onlineblogservice.com), your business (mydomain.example.com) and your store (mydomain.onlinestoreprovider.com). Using this method you could bring them all together under mydomain.com/blog, mydomain.com and mydomain.com/store.
  • You could append a new copyright notice to certain content.
  • You could add a cookie-acceptance header warning to a page which uses cookies but which predates such rules.
  • You could add JavaScript for tracking or analytics.
  • You could attempt to do all sorts of malicious things which we won't discuss here.
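To make the "bring them all together" idea from the list above a little more concrete, here is a rough sketch of the inbound rules it might involve. It is not part of this article's configuration: the rule names are made up, it assumes the proxy has already been enabled in the way described later in this article, and it only covers routing requests (links inside the returned pages would still need outbound rules of their own).

```xml
<rules>
	<!-- Hypothetical: mydomain.com/blog/... is fetched from the blog host. -->
	<!-- Note that this simple pattern needs the slash after "blog". -->
	<rule name="ProxyBlog" stopProcessing="true">
		<match url="^blog/(.*)" />
		<action type="Rewrite" url="https://blog.onlineblogservice.com/{R:1}" />
	</rule>
	<!-- Hypothetical: mydomain.com/store/... is fetched from the store host. -->
	<rule name="ProxyStore" stopProcessing="true">
		<match url="^store/(.*)" />
		<action type="Rewrite" url="https://mydomain.onlinestoreprovider.com/{R:1}" />
	</rule>
	<!-- Hypothetical: everything else is fetched from the business site. -->
	<rule name="ProxyBusiness" stopProcessing="true">
		<match url="(.*)" />
		<action type="Rewrite" url="https://mydomain.example.com/{R:1}" />
	</rule>
</rules>
```

The more specific rules come first and each one stops processing, so the catch-all at the end only sees requests that the first two didn't claim.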

A brief aside - a real example

As previously mentioned, if we weren't using Azure websites but were instead using a server installation of IIS then we could also do this, as in the case of a recent client of mine. Their infrastructure looked like this:

Web Server - exposed to the internet, connected internally to...

Application Server - not exposed to the internet, connected internally to...

Database server - not exposed to the internet or to the web server

This is a fairly common setup where the web server can see the application server, but cannot communicate directly with the database server. The problem they had was that they wanted to install a simple two-tier web application requiring database access and for it to be accessible via the internet.

  • If they installed the application on the web server then it couldn't see the database.
  • If they installed the application on the application server then, whilst it could see the database, it couldn't connect to the outside world.

The solution was essentially the same as what is described here, but the configuration was done on their web server instead of in Azure. We installed a simple reverse proxy on the externally visible web server which forwarded all requests to the application server. Security wasn't compromised by exposing the database server to the web server, and neither the application server nor the database server had to be connected to the outside world. It took a few minutes to configure and worked well.

Back to our example

In our example we want to go to https://tomssl-proxy.azurewebsites.net/ (don't click it yet) and for it to retrieve content from https://tomssl.com/ without us noticing. We also want to edit the response which is sent back to the browser, changing all references to https://tomssl.com/ therein. We'll make a few other changes to the visible content, changing some dates and images, blanking out some words and dropping the terminal g from others. Finally we'll add a banner to the top of the page explaining what's been done.

If we had access to the source code then this would be fairly straightforward, although somewhat onerous and error-prone. For the purposes of this experiment we won't have access to anything except the URL of the live website.

Enter IIS URL rewriting and Application Request Routing (ARR). The bit we're interested in enables you to intercept requests and send them somewhere else and also to edit the response which is sent back to the browser, in my case changing any links therein.

For a resource giving some useful examples of URL rewriting in IIS, see Ruslan Yakushev's blog post 10 URL Rewriting Tips and Tricks. Not only that, he's already written briefly about using an Azure Web Site as a reverse proxy.

To achieve our aim we are going to use an XDT transform (part of Azure Site Extensions) to tweak the applicationHost.config file that is generated for us. Then we'll be able to access previously unavailable values in our web.config file.

We're only going to upload two files to our free Azure Web App: applicationHost.xdt and web.config. If you've already got an Azure subscription you can get this up and running in a few minutes.

applicationHost.xdt

We're going to add and configure a <proxy> node and also add some server variables which we will access in the web.config file.

UPDATE: The allowedServerVariables used to be marked as Insert, but should be InsertIfMissing, just in case any other xdt files are also adding the same server variables, since duplicate values can cause the site to break. I have now corrected this (see below). Thanks to David Ebbo for pointing this out in the comments.

<?xml version="1.0"?>
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
	<system.webServer>
		<proxy xdt:Transform="InsertIfMissing" enabled="true" preserveHostHeader="false" reverseRewriteHostInResponseHeaders="false" />
		<rewrite>
			<allowedServerVariables>
				<add name="HTTP_X_ORIGINAL_HOST" xdt:Transform="InsertIfMissing" />
				<add name="HTTP_X_UNPROXIED_URL" xdt:Transform="InsertIfMissing" />
				<add name="HTTP_X_ORIGINAL_ACCEPT_ENCODING" xdt:Transform="InsertIfMissing" />
				<add name="HTTP_ACCEPT_ENCODING" xdt:Transform="InsertIfMissing" />
			</allowedServerVariables>
		</rewrite>
	</system.webServer>
</configuration>

Save the above in a new file called applicationHost.xdt and upload it to the /site folder of your Azure website.

There are a lot of different ways you can do this. In this case I'd suggest it might be easiest using SFTP via WinSCP, which is what I did when writing this article.

web.config

There are two main parts of the web.config file which we're going to edit (inbound and outbound rules) and they are both under the configuration/system.webServer/rewrite node.

Inbound Rules

<rules>
		<rule name="ForceSSL" stopProcessing="true">
			<match url="(.*)" />
			<conditions>
				<add input="{HTTPS}" pattern="^OFF$" ignoreCase="true" />
			</conditions>
			<action type="Redirect" url="https://{HTTP_HOST}/{R:1}" redirectType="Permanent" />
		</rule>
		<rule name="Proxy" stopProcessing="true">
			<match url="(.*)" />
			<action type="Rewrite" url="https://tomssl.com/{R:1}" />
			<serverVariables>
				<set name="HTTP_X_UNPROXIED_URL" value="http://tomssl.com/{R:1}" /> 
				<set name="HTTP_X_ORIGINAL_ACCEPT_ENCODING" value="{HTTP_ACCEPT_ENCODING}" /> 
				<set name="HTTP_X_ORIGINAL_HOST" value="{HTTP_HOST}" />
				<set name="HTTP_ACCEPT_ENCODING" value="" />
			</serverVariables>
		</rule>
</rules>

The first rewrite rule is fairly self-explanatory.

ForceSSL - makes sure that you are coming in using SSL by permanently redirecting requests from http to https.

The second rule is slightly more complex.

Proxy - retrieves content from tomssl.com, grabs the original HTTP_ACCEPT_ENCODING and HTTP_HOST headers and stores them for later use and then blanks out the HTTP_ACCEPT_ENCODING header. It also stores the original unproxied URL so that we can provide a link to the original page in our banner.

If you don't blank the HTTP_ACCEPT_ENCODING header on the way in, then the outbound rules won't work. It's not possible to rewrite content which has already been compressed.

In other words, if you remove this line,

<set name="HTTP_ACCEPT_ENCODING" value="" />

then add system.webServer/httpErrors (just to show more detailed errors) like this,

<system.webServer>
	<httpErrors errorMode="Detailed" />
    ...

you'll see this error:

HTTP 500.52 Error

Outbound Rules

If we only wanted to change outbound links and not general text in the body of the page, we'd use a regular expression to make sure we only rewrote the relevant parts, otherwise we'd be placing extra load on the server for no reason. In other words we'd check we were dealing with HTML and then we'd edit the filterByTags attribute to select <a> tags (and only those that didn't start with a single /, since relative links are already okay). Check the documentation for more.

<rule name="ChangeReferencesToOriginalUrl" patternSyntax="ExactMatch" preCondition="CheckContentType">
	<match filterByTags="None" pattern="https://tomssl.com" />
   	<action type="Rewrite" value="https://{HTTP_X_ORIGINAL_HOST}" />
</rule>

Change References To Original Url - this rule changes all references to https://tomssl.com/ so that they refer to the original host we used to visit the site (remember we saved it in the Proxy inbound rule).

I have omitted the trailing / from the pattern here as the Home link on my original web page doesn't end with '/'. If you want your rewrite rules to work properly every time, do it this way.
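For comparison, the narrower, links-only approach described at the start of this section might look something like the sketch below. It is not part of the configuration used in this article: the names RewriteAnchorLinksOnly and CheckHtmlOnly are made up for illustration, and it assumes the default regular-expression pattern syntax together with filterByTags="A", so that the pattern is tested against the link target of each <a> tag rather than against the whole response.

```xml
<outboundRules>
	<!-- Hypothetical rule: only rewrites absolute links found in <a> tags. -->
	<rule name="RewriteAnchorLinksOnly" preCondition="CheckHtmlOnly">
		<!-- Relative links (starting with a single /) don't match, so they are left alone. -->
		<!-- No slash is required after the host, so a bare Home link still matches. -->
		<match filterByTags="A" pattern="^https?://tomssl\.com(.*)" />
		<action type="Rewrite" value="https://{HTTP_X_ORIGINAL_HOST}{R:1}" />
	</rule>
	<preConditions>
		<!-- Hypothetical precondition: only touch HTML responses. -->
		<preCondition name="CheckHtmlOnly">
			<add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
		</preCondition>
	</preConditions>
</outboundRules>
```

The trade-off is that text outside <a> tags (including any URLs that appear in scripts or plain text) is left untouched, which is exactly why the rule used in this article sets filterByTags to None instead.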

Next we need to add the rules for redaction and g dropping.

Redacting

<rule name="WordRedactionFilter1" patternSyntax="ExactMatch" preCondition="CheckContentType">
	<match filterByTags="None" pattern=" the " />
	<action type="Rewrite" value=" &lt;span style='background-color:black; color:black; cursor:help' title='REDACTED'&gt;XXXXX&lt;/span&gt;&#160;" />
</rule>

This rule replaces occurrences of " the " with XXXXX in black on a black background, adds a tooltip, changes the mouse pointer and adds a non-breaking space to the end, like this: XXXXX (try selecting the text with your cursor).

We have a similar rule for " a " too. I'm checking for a space in each case to make sure we don't accidentally blank out partial words or, worse still, alter parts of URLs and stop them from working.

Dropping the g

The rewrite rule for changing ing to in' is a bit simpler, as shown below:

<rule name="WordSubstitutionFilter1" patternSyntax="ExactMatch" preCondition="CheckContentType">
	<match filterByTags="None" pattern="ing " />
	<action type="Rewrite" value="in' " />
</rule>

The rule for changing 2015 to 2014 is very similar too.

Changing the images

You might be tempted to confine the image-substitution rule to <img> tags by setting the value in filterByTags and in some cases that might be sufficient, but if you want to be sure that you capture all references to your images then you might be better off doing something like this:

<rule name="ImageSubstitutionFilter1" patternSyntax="ExactMatch" preCondition="CheckContentType">
	<match filterByTags="None" pattern="/content/images/2015/06/Einstein_250.jpg" />
	<action type="Rewrite" value="/content/images/2015/06/einstein_250.jpg" />
</rule>

Adding a floating header

It's unlikely that you'd add a header quite like this, but you might need to add a cookie acceptance notice to a legacy site, or a notice of an impending event of some kind.

<rule name="AppendHeader" patternSyntax="ExactMatch" preCondition="CheckContentType">
	<match filterByTags="None" pattern="&lt;/body&gt;" />
	<action type="Rewrite" value="&lt;div style='font-family:&#34;Open Sans&#34;,sans-serif;font-size:1.5rem;text-align:center;padding:2px;background-color:#FF0000;color:#FFFFFF;z-index:99;position:fixed;top:0px;width:100%;border-bottom:1px solid grey;'&gt;This page has been altered by a free Microsoft Azure proxy. Details &lt;a href='https://tomssl-proxy.azurewebsites.net'&gt;here&lt;/a&gt;. See the original page &lt;a href='{HTTP_X_UNPROXIED_URL}'&gt;here&lt;/a&gt;&lt;/div&gt;&lt;/body&gt;" />
</rule>

As you can see, I have html-encoded the markup for the banner and am adding it to the end of the page just before the closing </body> tag.

Preconditions

For each of the outbound rules, we have specified a precondition which performs a check to make sure we don't alter the wrong types of data (e.g. we don't want to apply our filter to any image data, etc).

<preCondition name="CheckContentType">
	<add input="{RESPONSE_CONTENT_TYPE}" pattern="^(text/html|text/plain|text/xml|application/rss\+xml)" />
</preCondition>

I am changing all text and also the RSS feed - remember, in the case of the redaction we are pretending that the data has been permanently removed (large red banner at the top of the screen notwithstanding), so I don't want to let you see the original just by consuming the RSS feed.

Don't be tempted to set pattern="^(text/*|application/rss\+xml)" as that also matches text/css and you risk altering your CSS files.

If you don't want to change all text remember that HTML is of type text/html, things like robots.txt are text/plain, sitemap data is text/xml and rss data is application/rss+xml, so you could always do something like pattern="^(text/html|application/rss\+xml)" (we need to escape the + with a \) to change the HTML and the RSS feed, but not the robots.txt or the sitemap. For our example we have to change everything.
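Written out as a config fragment, that narrower check might look like the sketch below. The name CheckHtmlAndRss is hypothetical; to use it, the preCondition attribute on each outbound rule would have to refer to it instead of CheckContentType.

```xml
<preConditions>
	<!-- Hypothetical narrower precondition: HTML and the RSS feed only. -->
	<!-- robots.txt (text/plain) and the sitemap (text/xml) would pass through untouched. -->
	<preCondition name="CheckHtmlAndRss">
		<add input="{RESPONSE_CONTENT_TYPE}" pattern="^(text/html|application/rss\+xml)" />
	</preCondition>
</preConditions>
```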

A note about gzip compression

I believe that by unlocking the httpCompression element in applicationHost.xdt, setting a few extra variables in the web.config and copying the AcceptEncoding on the way in, clearing it temporarily and then setting it again on the way out after the rewriting, it should be possible to get gzip compression to work on the rewritten content. However, this is only available if you are using a paid Azure tier; since we're using the free tier this won't work. Annoyingly, so far I have been unable to get that to work even when using a paid tier. With a bit of luck I'll get it working and update this article in due course.

Putting it all together

Now all that remains to be done is to combine the elements above into a web.config file and upload it to our Azure Web App.
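For what it's worth, the "setting it again on the way out" step is usually expressed as an outbound rule along the lines of the sketch below. This is an untested illustration of that one step only: the rule and precondition names are made up, it relies on the HTTP_X_ORIGINAL_ACCEPT_ENCODING value saved by the Proxy inbound rule, and on its own it does not enable compression (the httpCompression changes mentioned above would still be needed).

```xml
<outboundRules>
	<!-- Hypothetical rule: restore the browser's original Accept-Encoding value on the way out. -->
	<rule name="RestoreAcceptEncoding" preCondition="NeedsRestoringAcceptEncoding">
		<match serverVariable="HTTP_ACCEPT_ENCODING" pattern="^(.*)" />
		<action type="Rewrite" value="{HTTP_X_ORIGINAL_ACCEPT_ENCODING}" />
	</rule>
	<preConditions>
		<!-- Hypothetical precondition: only fire when the inbound rule actually saved a value. -->
		<preCondition name="NeedsRestoringAcceptEncoding">
			<add input="{HTTP_X_ORIGINAL_ACCEPT_ENCODING}" pattern=".+" />
		</preCondition>
	</preConditions>
</outboundRules>
```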

web.config

<?xml version="1.0" encoding="utf-8"?>
<configuration>
	<system.webServer>
		<httpErrors errorMode="Detailed" />
		<rewrite>
			<rules>
				<rule name="ForceSSL" stopProcessing="true">
					<match url="(.*)" />
					<conditions>
						<add input="{HTTPS}" pattern="^OFF$" ignoreCase="true" />
					</conditions>
					<action type="Redirect" url="https://{HTTP_HOST}/{R:1}" redirectType="Permanent" />
				</rule>
				<rule name="Proxy" stopProcessing="true">
					<match url="(.*)" />
					<action type="Rewrite" url="https://tomssl.com/{R:1}" />
					<serverVariables>
						<set name="HTTP_X_UNPROXIED_URL" value="https://tomssl.com/{R:1}" />
						<set name="HTTP_X_ORIGINAL_ACCEPT_ENCODING" value="{HTTP_ACCEPT_ENCODING}" /> 
						<set name="HTTP_X_ORIGINAL_HOST" value="{HTTP_HOST}" />
						<set name="HTTP_ACCEPT_ENCODING" value="" />
					</serverVariables>
				</rule>
			</rules>
			<outboundRules>
				<rule name="ChangeReferencesToOriginalUrl" patternSyntax="ExactMatch" preCondition="CheckContentType">
   					<match filterByTags="None" pattern="https://tomssl.com" />
   					<action type="Rewrite" value="https://{HTTP_X_ORIGINAL_HOST}" />
  				</rule>
		        <rule name="WordRedactionFilter1" patternSyntax="ExactMatch" preCondition="CheckContentType">
					<match filterByTags="None" pattern=" the " />
   					<action type="Rewrite" value=" &lt;span style='background-color:black; color:black; cursor:help' title='REDACTED'&gt;XXXXX&lt;/span&gt;&#160;" />
  				</rule>
				<rule name="WordRedactionFilter2" patternSyntax="ExactMatch" preCondition="CheckContentType">
					<match filterByTags="None" pattern=" a " />
   					<action type="Rewrite" value=" &lt;span style='background-color:black; color:black; cursor:help' title='REDACTED'&gt;XXXXX&lt;/span&gt;&#160;" />
  				</rule>
				<rule name="WordSubstitutionFilter1" patternSyntax="ExactMatch" preCondition="CheckContentType">
					<match filterByTags="None" pattern="ing " />
   					<action type="Rewrite" value="in' " />
  				</rule>
				<rule name="WordSubstitutionFilter2" patternSyntax="ExactMatch" preCondition="CheckContentType">
					<match filterByTags="None" pattern=" 2015" />
   					<action type="Rewrite" value=" 2014" />
  				</rule>
				<rule name="ImageSubstitutionFilter1" patternSyntax="ExactMatch" preCondition="CheckContentType">
   					<match filterByTags="None" pattern="/content/images/2015/06/Einstein_250.jpg" />
   					<action type="Rewrite" value="/content/images/2015/06/einstein_250.jpg" />
  				</rule>
				<rule name="AppendHeader" patternSyntax="ExactMatch" preCondition="CheckContentType">
   					<match filterByTags="None" pattern="&lt;/body&gt;" />
					<action type="Rewrite" value="&lt;div style='font-family:&#34;Open Sans&#34;,sans-serif;font-size:1.5rem;text-align:center;padding:2px;background-color:#FF0000;color:#FFFFFF;z-index:99;position:fixed;top:0px;width:100%;border-bottom:1px solid grey;'&gt;This page has been altered by a free Microsoft Azure proxy. Details &lt;a href='https://tomssl-proxy.azurewebsites.net'&gt;here&lt;/a&gt;. See the original page &lt;a href='{HTTP_X_UNPROXIED_URL}'&gt;here&lt;/a&gt;&lt;/div&gt;&lt;/body&gt;" />
  				</rule>
				<preConditions>
					<preCondition name="CheckContentType">
						<add input="{RESPONSE_CONTENT_TYPE}" pattern="^(text/html|text/plain|text/xml|application/rss\+xml)" />
					</preCondition>
				</preConditions>
			</outboundRules>
		</rewrite>
	</system.webServer>
</configuration>

Save the above in a new file called web.config and upload it to the /site/wwwroot folder of your Azure website.

And that's it, you can browse to your site and see the fruits of your labours, like this:

https://tomssl-proxy.azurewebsites.net/

TomSSL Proxied

Conclusion

IIS URL rewriting and Application Request Routing (ARR) are very powerful and can enable you to create a sophisticated reverse proxy with only a few lines of configuration code. In the past the barrier to entry was the requirement to have some kind of server running IIS. Now we can achieve the same thing using a free Azure web app. To demonstrate this, I've created https://tomssl-proxy.azurewebsites.net/ which is a reverse proxy of this website with a few (slightly) humorous changes. It's completely free; it doesn't cost me anything whatsoever (no MSDN dev credits; literally nothing).

I think it's pretty amazing. Why not sign up for a free trial of Azure and give it a go?
