24 Dec 2020

404 page with Publish and GitHub Pages

I have been meaning to try out John Sundell's static site generator Publish for a while now, and since the holidays during the coronavirus pandemic meant more free time than usual, I decided to do it. As a Swift enthusiast, I found the tool a joy to use and ended up redoing my whole website with it. While most of the things I wanted to do were simple, it took me a bit of research and trial and error to figure out how to create my own 404 page.

My website is hosted on GitHub Pages, and ultimately it is your hosting solution that dictates how you should implement your 404 page. The GitHub Pages documentation says you should have either a 404.html or a 404.md file at the root of your published site.

First attempt - add 404.md

I first tried simply adding a 404.md page under the Content folder to see what would happen. After running publish generate, a new index.html page was generated in Output/404. This is, however, not what GitHub Pages expects. Accessing your.website/404 does indeed show the custom page, but navigating to any other nonexistent URL still shows the default GitHub Pages 404 page.

Another downside of this approach is that the 404 page gets added to your sitemap.xml. Search engines use the sitemap to index your website, so the page could turn up in search results, which is not what you want for a 404 page.

I decided to try alternatives.

Second attempt - copy generated 404 HTML

My second idea was to add a custom publishing step that copies the generated 404 page to the root of the output folder under the name GitHub Pages expects.

try Website().publish(
    ...
    additionalSteps: [
        ...
        .step(named: "Copy 404 page") { context in
            try context.copyFileToOutput(from: "Output/404/index.html", 
                                         to: "Output/404.html")
        }
        ...
    ]
)

This approach also has two flaws. The first one is that it simply does not work: the steps passed in additionalSteps run before the HTML is generated, so at the time this step is executed the 404 page does not yet exist inside the Output folder.

This could probably be solved by listing all the publish steps manually and executing the copy step as the very last one, but the other flaw remains, namely the 404 page is still listed in sitemap.xml.
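To make the copy itself concrete, here is a minimal sketch of what such a last step would have to do. It uses plain Foundation rather than Publish's own API, so it could run after generation has finished, for example from a small script or from the body of a step placed at the very end of a manually listed pipeline. The function name copyNotFoundPage is hypothetical, and the folder layout is assumed to match the one from my first attempt (Output/404/index.html).

```swift
import Foundation

/// Copies `<output>/404/index.html` to `<output>/404.html`.
/// Hypothetical helper for illustration; it is not part of Publish.
func copyNotFoundPage(inOutputFolder output: URL) throws {
    let fileManager = FileManager.default
    let source = output.appendingPathComponent("404/index.html")
    let destination = output.appendingPathComponent("404.html")

    // copyItem(at:to:) throws if the destination already exists,
    // so remove a stale copy left over from a previous run first.
    if fileManager.fileExists(atPath: destination.path) {
        try fileManager.removeItem(at: destination)
    }
    try fileManager.copyItem(at: source, to: destination)
}
```

Calling it with try copyNotFoundPage(inOutputFolder: URL(fileURLWithPath: "Output")) after the site has been generated would produce the 404.html file GitHub Pages looks for. It does nothing about the sitemap, though.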

Therefore I quickly abandoned this approach as well.

Third attempt - 404.html in Resources

My final solution turned out to be really simple. I copied the HTML page generated by my first attempt, renamed it to 404.html and moved it to the Resources folder. Everything that is in the Resources folder gets automatically copied to the Output folder on website generation. The advantage here is that my 404 page is no longer listed in sitemap.xml and, well, the obvious advantage is that this actually works for GitHub Pages 😄.

One downside is that my 404 page is pretty much hardcoded now, so in order to change it I have to dig into the HTML and edit it manually. This is not a big issue, however: the page itself is pretty small and I do not expect to change it very often anyway, so this is what I ended up with.

I hope you enjoyed this article.

If you notice any mistakes or issues in this article, please let me know at dev@janagrill.io.