How to use Hugo with GitLab CI/CD to automatically publish asciidoctor and markdown content into a static website hosted in AWS.

Special thanks to the quality content at Bad Sector Labs, from which I’ve drawn inspiration to get my own static site up and running.

The source code for this website is available at: gitlab.com/nimnaij/jianmin.dev

Note
In August 2022, the blog theme was updated and the .gitlab-ci.yml file was changed. What you read here is no longer accurate.

Why I chose static

While platforms like Blogger, Squarespace, Ghost, or WordPress provide effortless website solutions, most cost money or show ads, and they restrict how much you can customize them. Alternatively, I could self-host an open-source CMS, but that risks buggy software (looking at you, WordPress) and potentially high maintenance. My preferred option, and the topic of this post, is to use a static website.

A static website is one with fixed content. In contrast, a dynamic website uses a server-side language to build the final content on the fly as it responds to a request from a browser. The content of a static website is usually written in a markup language, which is then used to generate the final HTML ahead of time. That HTML is uploaded to a web server and served directly to users without any server-side script execution.

When looking for a blogging solution, I did not want to pay excessive costs for a managed solution or have ads in a free tier. Self-hosting at home or adding a new VPS instance was not ideal, as I had just finished migrating most of my VPS content to AWS S3, with AWS Lambda for my dynamic content.[1] I wanted a blog solution that was low maintenance, low cost, and could scale easily if needed. A static website fits all of these criteria. As an added bonus, it is code-driven, has a lower attack surface, and because it is static you can host it on GitLab Pages or GitHub Pages.

Hugo

The two big players in static website generation are Jekyll (Ruby) and Hugo (Go), with an honorable mention to Pelican for the Python fans out there. There are some great themes for all three projects, but Jekyll felt very JavaScript-heavy in its themes and Ruby is not my preferred language. Pelican’s community felt smaller than the other projects, especially in available themes. Between Hugo and Jekyll, I really didn’t want to deal with Ruby, and Hugo comes with a standalone static binary, so I tried it out first. It was a great experience out of the box and I haven’t looked back.

I went with the hugo-dusk theme, but modified it a bit, taking some inspiration from this Jekyll Hacker Blog theme and increasing the max-width a bit for wider screens. My fork of the theme is available on GitLab.

Markup Languages

Out of the box, Hugo primarily supports Markdown, but its documentation lists support for various other markup tools, called "helpers," including RST, asciidoc/asciidoctor, and pandoc. The documentation cautions that support for these helpers is in its infancy. Hugo hard-codes some of the arguments passed to the helpers, which reduces flexibility in configuring them.

Markdown is great but sometimes I want a little bit more. I have experience with asciidoctor so I decided to spend some time making sure asciidoctor looks right. You can preview some of the adoc features in action in this demo page. This post itself is also written in asciidoctor.

A challenging part with asciidoctor is that Hugo hard-codes the arguments passed to asciidoctor, locking it in "safe" mode, and doesn’t provide a way to modify them. Instead, to get asciidoctor working in a meaningful way, I had to add a wrapper script earlier in my PATH[2] that tweaks the arguments and adds extra options before calling the real asciidoctor build command (see the referenced Dockerfile in the next section).
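
A wrapper of this kind can be sketched in a few lines of shell. This is only an illustration, not necessarily the script in the referenced Dockerfile. It assumes that Hugo passes `--safe` on the command line, that the real binary lives at /usr/bin/asciidoctor, and that swapping in `--safe-mode unsafe` is the desired tweak; the names `asciidoctor_wrapper` and `REAL_ASCIIDOCTOR` are made up for the example.

```shell
# Hypothetical wrapper logic: drop the hard-coded --safe flag and call the
# real asciidoctor with a different safe mode plus the remaining arguments.
asciidoctor_wrapper() {
  # Assumed location of the real binary; override with REAL_ASCIIDOCTOR.
  real="${REAL_ASCIIDOCTOR:-/usr/bin/asciidoctor}"

  # Rebuild the argument list without --safe: append every kept argument
  # after the originals, then shift the originals away.
  n=$#
  for arg in "$@"; do
    case "$arg" in
      --safe) ;;                   # drop the flag Hugo is assumed to pass
      *) set -- "$@" "$arg" ;;     # keep everything else, in order
    esac
  done
  shift "$n"

  "$real" --safe-mode unsafe "$@"
}

# Dry run: substitute echo for the real binary to print the final arguments.
REAL_ASCIIDOCTOR=echo asciidoctor_wrapper --no-header-footer --safe --trace -
```

Saved as an executable file named asciidoctor in a directory that comes before the real one in PATH, and ending with `asciidoctor_wrapper "$@"` instead of the dry run, a script like this would be picked up by Hugo in place of the real command.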

Note
As an aside, asciidoctor has a plugin that provides support for mathematical formulas LaTeX-style, meaning there is no good reason to ever use LaTeX for a website.

Automating Build and Deploy

Building and deploying is fairly straightforward with GitLab CI. I simply define my build and deploy jobs in a YAML file, and let GitLab handle the rest. I added my own personal runner to handle the build jobs, but there are shared runners available for free as well.

I spent some time tinkering with a before_script section in .gitlab-ci.yml and was trying to cache my dependencies after install, but eventually I decided it was easier just to make my own custom docker image on Docker Hub and pull it in directly.

Docker Images

Working with Docker Hub is straightforward, but I was frustrated to learn that you have to grant Docker Hub full access to your GitHub account if you want to automate building. Docker Hub also supports Bitbucket[3] though, so instead I created a Bitbucket account exclusively for docker images and threw my Dockerfiles in repositories there. After some trial and error, I got the images working how I wanted.

Build

The build image is available on Docker Hub at: nimnaij/hugo-adoc. You can reference the Dockerfile for a quick look at the wrapper for asciidoctor.

Testing my code with full adoc support becomes as simple as:

docker run -v "$(pwd)/hugo:/hugo" -w /hugo -p 0.0.0.0:1313:1313 \
--name adoc-dev nimnaij/hugo-adoc:latest \
hugo server -D -w --baseUrl=localhost --bind="0.0.0.0"

Deploy

The deploy image is fairly simple. I copied the Bad Sector Labs example by including some minifying tools to help condense the files before putting them online. The tools are installed from npm, so I used node as my base image and then download the awscliv2 installer and run it on top.

The deploy image is available on Docker Hub at: nimnaij/node-aws

Putting It All Together

The final product lets me build and deploy to AWS with nothing but a push (and a manual click to deploy in this configuration).

My .gitlab-ci.yml:

variables:
  HUGO_VERSION: 0.65.3
  GIT_STRATEGY: clone
  GIT_SUBMODULE_STRATEGY: recursive

stages:
  - build
  - deploy


build:
  stage: build
  tags:
    - docker
  image: nimnaij/hugo-adoc:latest # hugo and asciidoctor together
  script:
    - cd hugo && hugo
    - mv public ../
    - cd .. && find public/
  artifacts:
    paths:
      - public/
    # don't store forever, but long enough to publish the latest manually
    expire_in: 1 week

deploy:
  stage: deploy
  tags:
    - docker
  image: nimnaij/node-aws
  variables:
    AWS_ACCESS_KEY_ID: $AWS_ACCESS_KEY_ID
    AWS_SECRET_ACCESS_KEY: $AWS_SECRET_ACCESS_KEY
    CLOUDFRONT_DIST_ID: $CLOUDFRONT_DIST_ID
    AWS_REGION: "us-east-2"
    AWS_S3_BUCKET_NAME: "jianmin.dev"
  script:
    - aws --version
    - cd public
    - find . -iname \*.html -type f | xargs -I {} htmlminify -o {} {}
    - find . -iname \*.css -type f| xargs -I {} uglifycss --output {} {}
    - find . -iname \*.js -type f | xargs -I {} uglifyjs -o {} {}
    - ls -lahrt .
    - aws s3 sync . s3://${AWS_S3_BUCKET_NAME} 
      --region ${AWS_REGION} --acl public-read 
    # Set SKIP_INVALIDATION to any non-empty value to skip cache invalidation.
    # AWS charges per path submitted, with the first 1k free per month,
    # so /* is cheaper than submitting paths individually.
    # "test -n ... ||" is used so that skipping does not fail the job.
    - test -n "${SKIP_INVALIDATION}" ||
      aws cloudfront create-invalidation
      --distribution-id $CLOUDFRONT_DIST_ID --paths '/*'
  only:
    - master
  when: manual

Warning
Keep your secrets in GitLab’s CI/CD settings. Don’t put them in your source code.
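
To rehearse the deploy step outside of CI, the two AWS commands from the job can be wrapped in a small script that only prints them. This is a rough sketch rather than part of the pipeline: the `run` and `deploy` helpers and the `DRY_RUN` switch are invented for the example, and EXAMPLEDISTID is a placeholder, not a real distribution ID.

```shell
# Print the command when DRY_RUN is set; otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# deploy BUCKET REGION DISTRIBUTION_ID
# Mirrors the sync and invalidation commands of the deploy job.
deploy() {
  bucket="$1"; region="$2"; dist_id="$3"
  run aws s3 sync . "s3://${bucket}" --region "${region}" --acl public-read
  # Same switch as the pipeline: any non-empty SKIP_INVALIDATION skips this.
  if [ -z "${SKIP_INVALIDATION:-}" ]; then
    run aws cloudfront create-invalidation \
      --distribution-id "${dist_id}" --paths '/*'
  fi
}

# Dry run from inside the public/ directory; nothing is uploaded.
DRY_RUN=1 deploy jianmin.dev us-east-2 EXAMPLEDISTID
```

Without DRY_RUN the same function runs the commands for real, in which case the AWS credentials still need to come from the environment and not from the script.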

Final Thoughts on Alternative Approaches

When choosing a provider, I prioritized uptime over price. I want a reasonable steady-state cost, but I don’t want my server to fall over if a post gets popular.

I had previously been paying $5/month for a DigitalOcean droplet to host multiple domains, with enough storage to host my own static website if necessary. But with the low amount of traffic I was getting, I was able to migrate to Lambda and static content and stay under the free tier for most of my usage. Including Route 53 costs for DNS ($0.50/month per domain), I pay maybe $1/month at current traffic rates on this domain, including handling and forwarding email in Lambda.[4]

But on a separate domain, I recently shared a ~5GB OVA for use by local high schools to prepare for a CTF competition. I learned quickly that a couple hundred users can turn a small S3 bill into a larger S3 bill. If you are concerned about price, consider using GitLab or GitHub Pages instead. Publishing is even easier and you can still add a custom domain.
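
For a sense of scale, here is a back-of-the-envelope estimate of that kind of transfer bill. The numbers are assumptions for illustration only: 200 downloads stands in for "a couple hundred users", and $0.09 per GB is a ballpark rate for S3 data transfer out to the internet, so check current AWS pricing before relying on it. The `egress_cost` helper is made up for the example.

```shell
# Estimated transfer cost in dollars: size (GB) x downloads x rate ($/GB).
egress_cost() {
  awk -v gb="$1" -v n="$2" -v rate="$3" \
    'BEGIN { printf "%.2f\n", gb * n * rate }'
}

# A 5 GB file downloaded 200 times at an assumed $0.09 per GB.
egress_cost 5 200 0.09   # prints 90.00
```

The same 200 visitors loading a few hundred kilobytes of blog post would cost a fraction of a cent, which is why a single large file dominates the bill.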

I prefer and am familiar with GitLab’s CI tooling, but it’s no longer the only show in town. If I were starting from scratch, I might check out GitHub’s new Actions feature and publish directly to GitHub Pages.

That said, I hope this post has been helpful and informative. If you have any feedback, questions, or issues, feel free to ping me on Twitter.

Glossary:

CI/CD

Continuous Integration, Continuous Delivery, Continuous Deployment; in this context, it refers to GitLab’s built-in tooling to process code after it is committed for things like building, testing, and deploying.

AWS

Amazon Web Services. The monolithic beast that is Amazon’s Cloud Computing Services. Pricey, but the best in class.

CMS

Content Management System; see Wikipedia.

HTML

HyperText Markup Language.

VPS

Virtual Private Server; see Wikipedia. I have used Linode, AWS EC2, and DigitalOcean. I prefer DigitalOcean, although I haven’t used Linode since they redid their web interface.

VPN

Virtual Private Network; the value of commercial VPNs is disputed, but for self-hosting I recommend WireGuard or OpenVPN (the latter supports Duo 2FA).

CSS

Cascading Style Sheets; the part of the website that makes it look nice. If HTML is the wood frame of a house, CSS is the paint, wallpaper, carpet, and decorations. See more at w3schools.

RST

reStructuredText; see more at docutils.sourceforge.io.

YAML

YAML Ain’t Markup Language (recursive); "a human-friendly data serialization standard" - from yaml.org. It’s like JSON but you can add comments.

DNS

Domain Name System; it is why your internet isn’t working. See Wikipedia: https://en.wikipedia.org/wiki/Domain_Name_System

CTF

Capture the Flag; these are offensively-focused computer security competitions. Learn more at ctf101.org.

Footnotes:


1. Most of my dynamic needs fit in a lambda function. For example, I use https://api.jianmin.ninja/ip to check my public IP. Depending on the content, this may not always work. YMMV.
2. If you’re unfamiliar with the PATH environment variable, you can read more about it here: https://linuxhint.com/path_in_bash/
3. I am not an Atlassian fan, and I hate that Bitbucket forces you to have JavaScript enabled just to load the dumb page. 0/10 would not recommend. I probably could have just made a second GitHub account, but that would get confusing as well, so I plan to just never open the website.
4. I must caveat this by clarifying that it is easy to start playing with other tools in AWS and end up with a much larger bill.