InRhythm

Dec 21 2020

Lightning Talk: Creating a Proxy with Node + Express

TL;DR

I created a Node and Express proxy that rams together two websites / apps. If you are ever asked to develop code for a site you don’t have access to, this could save the day.

  1. Download the code:
    https://github.com/mattbillard/code-collider-v2
  2. Watch the video:

My most recent hobby project involved building a Node and Express proxy to ram two websites or apps together. I dubbed it “The Code Collider.” 

The idea of the Code Collider was born about two years ago when I was given a rather difficult coding challenge at a client. Another department had a website and we were being asked to iframe it. As you might know, iframes have a lot of “interesting nuances” as they might politely be called, one of which is that if a user right-clicks on a link and chooses “Open in new tab,” they will break out of the parent’s iframe and go directly to the iframed child site. Of all things, I was asked to write some code to prevent this and provide my solution to the other team. There was just one catch: I had no way of running the other team’s site in my local environment and thus (or so I initially thought) no easy way to test my code as I wrote it. 

This situation comes up more often than you’d think. There are several reasons a developer might not be able to run a site or app in their local environment. Perhaps it’s complex to set up, involves a lot of permissions, or the other team just can’t be bothered to spare the time to help you out. Regardless of the reason, I did eventually strike upon a novel way to solve the problem.

After what felt like a meandering odyssey of Stack Overflow posts, I managed to cobble together a Node and Express project that took the other team’s site, piped it through a proxy, injected my script into it, and served it to my browser. Even though my innovation worked and I was able to deliver the final code requested, there was just one thing that bothered me: I always had the nagging feeling that I had only gotten the proxy to work through a mixture of stubborn persistence and blind luck. I resolved to rebuild it from scratch in my free time one day, piece by piece, actually understanding how the solution worked – and that as it turns out is exactly what I did.

The Architecture

Here’s a diagram of how it works:

The browser makes a request to our Node Express proxy server where we then have three scenarios:

  1. If the user requested an HTML page, we need to combine the pages from websites 1 and 2. The proxy first asks website 1 and then website 2 for its HTML. It then modifies the two HTML files, combines them, and returns the result to the browser. (Details on how this works below.)
  2. The HTML page will then request the CSS, JavaScript, and other assets it requires. These requests will again go through the proxy which will pass on the requests. If website 1 has the asset, great, the proxy will return it to the browser. 
  3. If website 1 does not have the asset, the proxy will then ask website 2 and return it to the browser.
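
As a rough sketch of these three scenarios (not the project's actual code), the decision logic might look like the following. The names `wantsHtml`, `resolveRequest`, `site1`, `site2`, and `combineHtml` are hypothetical. The sketch also assumes that an HTML request can be recognized by its Accept header, that "website 1 does not have the asset" means a 404 status, and that both sites are asked for the same path.

```javascript
// Assumption: browsers include "text/html" in the Accept header when
// navigating to a page, but not when loading CSS, JavaScript, or images.
function wantsHtml(acceptHeader = '') {
  return acceptHeader.includes('text/html');
}

// site1 and site2 are hypothetical async functions that fetch a path from
// each website and resolve to { status, body }.
async function resolveRequest(request, site1, site2, combineHtml) {
  if (wantsHtml(request.accept)) {
    // Scenario 1: ask website 1, then website 2, and merge the two pages.
    const page1 = await site1(request.path);
    const page2 = await site2(request.path);
    return { status: 200, body: combineHtml(page1.body, page2.body) };
  }

  // Scenario 2: website 1 has the asset, so return it as-is.
  const fromSite1 = await site1(request.path);
  if (fromSite1.status !== 404) {
    return fromSite1;
  }

  // Scenario 3: website 1 does not have it, so fall back to website 2.
  return site2(request.path);
}
```

Passing the two fetchers in as arguments keeps the routing logic separate from the networking, which makes it easy to try out with stubbed sites before pointing it at real ones.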

Here are screenshots of the result:

In this example, we’ll use the real InRhythm.com website as the target into which we’ll inject some local code (in this case a basic Create React App project). The final result on the right is an actual screenshot of the 2 websites living together in the same browser window. Now this is pretty exciting; the target website and the code that you are developing could be anything!

How it Works

  1. As mentioned above, website 1 and 2’s HTML are combined. This involves a few steps. A webpage can’t have two doctype, html, head, or body tags, so we’ll use some regex to strip those from website 2’s HTML. Once website 2’s HTML is ready, we’ll inject it before website 1’s closing </body> tag. 
  2. Next come our modifications to website 1’s HTML. There are a few things to cover.

    Firstly, many websites have full ‘absolute URLs’ for their links. They look like this: https://www.inrhythm.com/who-we-are/
    The problem is that if the user clicks on one of these, they’ll leave our proxy and go straight to the target website. We’ll solve this by removing the https://www.something.com part and keeping just the path after it (here, /who-we-are/).

    Secondly, we inject some CSS to remove backgrounds and allow clicks to pass through website 2 to website 1. (Keep in mind this will probably be slightly different depending on the two sites you are combining.)

    Thirdly, we inject the modified HTML from website 2 before website 1’s closing </body> tag.
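
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's actual code: the function names, the `targetOrigin` parameter, and the `overlayCss` parameter are all hypothetical, and the regexes are deliberately simple, so they assume reasonably well-formed HTML with a single closing </body> tag.

```javascript
// Remove the tags a page may only contain once, but keep their contents.
// The \b stops "head" from also matching tags such as <header>.
function stripDocumentTags(html) {
  return html
    .replace(/<!doctype[^>]*>/gi, '')
    .replace(/<\/?(?:html|head|body)\b[^>]*>/gi, '');
}

// Drop the target's origin (for example "https://www.inrhythm.com") so that
// absolute links become root-relative and clicks stay on the proxy.
function makeLinksRelative(html, targetOrigin) {
  return html.split(targetOrigin).join('');
}

// Combine the two pages: fix website 1's links, then inject the CSS and the
// stripped-down website 2 HTML just before website 1's closing </body> tag.
function combineHtml(html1, html2, targetOrigin, overlayCss) {
  const injected = `<style>${overlayCss}</style>\n${stripDocumentTags(html2)}\n`;
  // A replacer function is used so that any "$" in the injected HTML is
  // inserted literally instead of being read as a replacement pattern.
  return makeLinksRelative(html1, targetOrigin)
    .replace(/<\/body>/i, (closingTag) => injected + closingTag);
}
```

The CSS is left as a parameter because, as noted above, it will differ depending on the two sites being combined.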

Gotchas to Avoid

As mentioned above, it took a lot of research as well as trial and error to solve all the technical hurdles this project presented. Here are some of the main ones.

  1. Websites usually compress or “Gzip” their content. Normally this is a great thing. It means less data is transferred and websites load more quickly. This is not going to work for our case however. You can’t parse, manipulate, and modify HTML if it looks like gibberish. The solution was actually quite simple: as it turns out, there’s a request header, ‘accept-encoding’, that you can use to ask the server not to Gzip anything.
  2. There’s another header we need to send however. Because we’re using a proxy, all requests are going to have the header ‘host’ set to ‘localhost’. Now this is probably not a problem for most sites, but to the target server, this doesn’t look like a very normal request, and indeed, I did find some websites responded oddly and returned pages that looked nothing like I expected. The solution again was quite simple: set the ‘host’ header of our request to the target site’s host name. 
  3. Now as you’ve probably gathered from above, we’ve modified the responses quite a lot, so their length no longer matches what the original ‘content-length’ header says, and this is going to cause the browser to do some odd things. The solution to this problem is to delete the ‘content-length’ header before the proxy sends your browser any response. This will stop the browser from truncating the response and throwing away all the hard work we did. 
  4. Finally, I’ll mention one other problem I solved. If you are combining sites that use https, the proxy might complain that the SSL certificates don’t match what it’s expecting. It turns out to be rather easy to relax this check. 
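
Here is a minimal sketch of all four fixes using only Node's built-in modules. It is one possible way to do it, not necessarily how the project does it: the function names are hypothetical, and sending `accept-encoding: identity` is an assumed choice for asking the server to skip compression.

```javascript
const https = require('node:https');

// Build the headers for the request the proxy sends to the target site.
// Node lower-cases incoming header names, so lower-case keys are used here.
function buildUpstreamHeaders(browserHeaders, targetHost) {
  return {
    ...browserHeaders,
    // Gotcha 1: ask for an uncompressed body so the HTML can be edited as text.
    'accept-encoding': 'identity',
    // Gotcha 2: present the target's own host name instead of "localhost".
    host: targetHost,
  };
}

// Build the headers for the response the proxy sends back to the browser.
function buildBrowserHeaders(upstreamHeaders) {
  // Gotcha 3: the body was rewritten, so the original length no longer applies.
  const { 'content-length': staleLength, ...rest } = upstreamHeaders;
  return rest;
}

// Gotcha 4: an agent that skips certificate verification. It can be passed as
// the `agent` option of https.request(). For local development only.
const lenientAgent = new https.Agent({ rejectUnauthorized: false });
```

Turning off certificate verification removes a real safety check, so a setting like this belongs in a local development proxy and nowhere else.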

Conclusion

That’s it. Feel free to download the code, follow the instructions in the readme, and open your browser to localhost:8000. You should be able to see two websites rammed together. 

Written by Matt Billard · Categorized: Cloud Engineering, InRhythmU, Learning and Development, Web Engineering · Tagged: express, Node.js, proxy
