I created a Node and Express proxy that rams together two websites / apps. If you are ever asked to develop code for a site you don’t have access to, this could save the day.
- Download the code:
- Watch the video:
My most recent hobby project involved building a Node and Express proxy to ram two websites or apps together. I dubbed it “The Code Collider.”
The idea of the Code Collider was born about two years ago when I was given a rather difficult coding challenge at a client. Another department had a website and we were being asked to iframe it. As you might know, iframes have a lot of “interesting nuances” as they might politely be called, one of which is that if a user right-clicks on a link and chooses “Open in new tab,” they will break out of the parent’s iframe and go directly to the iframed child site. Of all things, I was asked to write some code to prevent this and provide my solution to the other team. There was just one catch: I had no way of running the other team’s site in my local environment and thus (or so I initially thought) no easy way to test my code as I wrote it.
This situation comes up more often than you’d think. There are several reasons a developer might not be able to run a site or app in their local environment. Perhaps it’s complex to set up, involves a lot of permissions, or the other team just can’t be bothered to spare the time to help you out. Regardless of the reason, I did eventually strike upon a novel way to solve the problem.
After what felt like a meandering odyssey of Stack Overflow posts, I managed to cobble together a Node and Express project that took the other team’s site, piped it through a proxy, injected my script into it, and served it to my browser. Even though my innovation worked and I was able to deliver the final code requested, there was just one thing that bothered me: I always had the nagging feeling that I had only gotten the proxy to work through a mixture of stubborn persistence and blind luck. I resolved to rebuild it from scratch in my free time one day, piece by piece, actually understanding how the solution worked – and that as it turns out is exactly what I did.
Here’s a diagram of how it works:
The browser makes a request to our Node Express proxy server where we then have three scenarios:
- If the user requested an HTML page, we need to combine the page from website 1 and 2. The proxy first asks website 1 and then website 2 for its HTML. It then modifies the two HTML files, combines them, and returns the result to the browser. (Details on how this works below.)
- If website 1 does not have the asset, the proxy will then ask website 2 and return it to the browser.
Here are screenshots of the result:
In this example, we’ll use the real InRhythm.com website as the target into which we’ll inject some local code (in this case a basic Create React App project). The final result on the right is an actual screenshot of the 2 websites living together in the same browser window. Now this is pretty exciting; the target website and the code that you are developing could be anything!
How it Works
- As mentioned above, website 1 and 2’s HTML are combined. This involves a few steps. Webpages can’t have 2 doctype, html, head, or body tags, so we’ll use some regex to strip those. Now that website 2’s HTML is ready, we’ll inject it before website 1’s closing </body> tag.
- And here is the code that shows our modifications to website 1’s HTML. It shows a few things.
Firstly, many websites have full ‘absolute URLs’ for their links. They look like this: https://www.inrhythm.com/who-we-are/
The problem is if the user clicks on this, they’ll be taken away from our proxy and go to the target website. We’ll solve this by removing all www.something.com bit and keep just the bit after the slash.
Secondly, we show injecting the CSS discussed above to remove backgrounds and allow clicks to pass through website 2 to website 1. (Keep in mind this will probably be slightly different depending on the two sites you are combining.)
Thirdly, we show injecting the HTML from website 2 we modified from above before the closing </body> tag.
Gotchas to Avoid
As mentioned above, it took a lot of research as well as trial and error to solve all the technical hurdles this project presented. Here are some of the main ones below.
- Websites usually compress or “Gzip” their content. Normally this is a great thing. It means less data is transferred and websites load more quickly. This is not going to work for our case however. You can’t parse, manipulate, and modify HTML if it looks like gibberish. The solution was actually quite simple: as it turns out, there’s a header you can send with your request to ask the server not to Gzip anything.
- There’s another header we need to send however. Because we’re using a proxy, all requests are going to have the header ‘host’ set to ‘localhost’. Now this is probably not a problem for most sites, but to the target server, this doesn’t look like a very normal request, and indeed, I did find some websites responded oddly and returned pages that looked nothing like I expected. The solution again was quite simple, just modify one of the headers of our request.
- Now as you’ve probably gathered from above, we’ve modified our requests quite a lot, and as a result this is going to cause the browser to do some odd things. The solution to this problem is to delete the ‘content-length’ header before the proxy sends your browser any response. This will stop the browser from truncating the response and removing all the hard work we did.
- Finally, I’ll mention one other problem I solved. If you are combining sites that use https, the proxy might complain that the SSL certificates don’t match what it’s expecting. Turns out it’s rather easy to relax this with the following code:
That’s it. Feel free to download the code, follow the instructions in the readme, and open your browser to localhost:8000 . You should be able to see two web sites rammed together.