This project has two posts spanning a year, about a ten-minute read.

  1. Optimizing image sizes in Hexo
  2. Lazy image resizing in hexo-image-sizes v2
  • Optimizing image sizes in Hexo

    3 March 2017 · Post 1 of 2 in hexo-image-sizes

    I use Hexo to generate my site. It’s a static site generator, and it follows the typical paradigm of keeping posts (in Markdown format) in a source folder, along with images and other resources. The idea is to keep all the “originals” in the source directory, and then use the site generator to render them to their final display format, into an output directory.

    The problem is that source images are large. I want to keep them in their original full resolutions, because I might want to target bigger displays or allow people to download them, and in general because I want the source directory to be a complete archive of the site’s content.

    Instead of serving these huge images to the user, it would be better to make them as small as possible without degrading the experience of looking at them. How big they need to be depends on what they’re for. If they’re thumbnails in a gallery, for example, they should be really small files! A full-width banner image should obviously be larger. But none of these likely needs to be the original size, which is often on the order of 4 MB.

    Dynamic CMSs like WordPress already solve this problem by serving images derived from the ones you upload, resized based on your theme’s settings. For a static site, putting out images with different viewing “profiles” should be just as easy, but I didn’t see a way to do that with Hexo, so I wrote a plugin.


    The hexo-image-sizes plugin lets you define image profiles, each of which specifies the optimal size for an image, and then it takes care of generating the images for you. Each source image is then available at its original size and at the size defined by each profile.

    You can read about how to use it on GitHub. I’m not going to cover that here; instead, I’d like to discuss some of the design decisions involved in making it.

    How it works

    The Hexo API

    hexo-image-sizes is a Hexo plugin, which means it uses the Hexo API to interact with your site’s files. Hexo has a pretty flexible API, though the documentation is a little too terse for comfort sometimes.

    The plugin API revolves around these core concepts:

    • Plugins are npm modules whose names start with "hexo-"
    • Hexo loads plugins automatically at start and provides them with the hexo global instance object, which includes your site information
    • Text files (and any other files with a registered renderer) are “rendered” through renderer functions, which receive the file data and return its rendered form
    • Other files, like images, get copied directly to the output directory
    • Plugins can register processor functions, which Hexo calls for each file whose name matches a certain pattern


    My first attempt was to make a renderer for the images, to intercept them in the pipeline and resize them. There are two problems here.

    The first problem is that renderers are designed to transform the input file into some other, final format, and they can’t control the name of the output, or how many outputs there should be. So if we want to create multiple different sizes of one source image, a renderer isn’t the way to go.

    The other problem is that we don’t really want to read the image data into JavaScript-land. The image files can be huge, and Hexo loads every file that has a registered renderer asynchronously and seemingly without any throttling, so the memory cost of loading all the images in order to render them is not acceptable.


    Instead, we want to keep the image handling as low-level as possible. The awesome sharp module uses a native-compiled libvips under the hood to handle images, so the memory overhead should be much lower if we can just pass sharp the name of an input file and the path to which to write the resized image.

    The “processor” abstraction in Hexo gives us visibility into the stream of files Hexo is processing for the site. Each processor is simply a function, and it is invoked with a Hexo File object as its only argument. The important keys in this object are the “type” and “source”.

    • The file type indicates what Hexo is doing with it. It’s one of these strings: "create", "update", "skip", "delete" (as seen in the source code; this is an example of where the docs are lacking). For now, I only look at files that we’re creating or updating, but in the future I should probably handle deleting files as well.
    • The source contains the file’s full path on disk.

    Now we simply need to tell sharp this file’s source and where to write the output.
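
    As a rough sketch of that processor (not the plugin’s actual code), the function below ignores everything except created or updated files and hands the source path to a resizer. The names makeImageProcessor and resizeFile are hypothetical; in a real plugin, resizeFile would wrap a call to sharp.

```javascript
// Hypothetical sketch: build a Hexo-style processor function.
// `resizeFile(sourcePath)` is a stand-in for whatever calls sharp.
function makeImageProcessor(resizeFile) {
  const handledTypes = new Set(["create", "update"]);

  return function processImage(file) {
    // file.type is one of "create", "update", "skip", "delete".
    if (!handledTypes.has(file.type)) {
      return undefined; // nothing to do for skipped or deleted files
    }
    // file.source is the full path of the file on disk.
    return resizeFile(file.source);
  };
}

// Example: record which paths would be resized instead of resizing them.
const resized = [];
const processImage = makeImageProcessor((sourcePath) => {
  resized.push(sourcePath);
});

processImage({ type: "create", source: "/site/source/img/cat.jpg" });
processImage({ type: "skip", source: "/site/source/img/dog.jpg" });
processImage({ type: "update", source: "/site/source/img/owl.jpg" });
```

    In a plugin, a call along the lines of hexo.extend.processor.register(/\.jpe?g$/, processImage) would hook this function up; the pattern shown is only an illustration.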

    Whither output?

    So here’s the ugly hack of using a processor for generating these images: we receive information about the input file, but we have to guess the name of the output file. From what I can tell, Hexo simply uses the public_dir configured in _config.yml as the base path, and then tacks on the path of the input file relative to the source directory. However, that could change, and my plugin wouldn’t work anymore.

    I name the output files based on the original filename and the name of the current profile. So if we have a profile called “thumbnail”, then an image called “cat.jpg” will end up in the output directory at full size as “cat.jpg” and as a thumbnail at “thumbnail-cat.jpg”.

    There’s one more complication, which is that the plugin will run when Hexo starts up. Unfortunately, that means we’re running ahead of the renderers, and it could be that none of the output directory structure exists yet. After deciding on the full output path, we have to make sure that path exists. I’m using mkdirp to create a directory with all of its intermediate ancestors if it doesn’t already exist.
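
    The path guessing and directory creation described above might look roughly like this. It is a sketch under the same assumption stated earlier, that the output path is public_dir plus the file’s path relative to the source directory. The helper names are hypothetical, and Node’s built-in fs.mkdirSync with the recursive option is used here in place of mkdirp.

```javascript
const path = require("path");
const fs = require("fs");

// Hypothetical helper: guess where a resized copy should be written.
// Assumes the output path is the public directory plus the file's path
// relative to the source directory, with the profile name as a prefix.
function guessOutputPath(sourceDir, publicDir, sourceFile, profileName) {
  const relative = path.relative(sourceDir, sourceFile);
  const prefixedName = `${profileName}-${path.basename(relative)}`;
  return path.join(publicDir, path.dirname(relative), prefixedName);
}

// Make sure the target directory exists before writing into it.
// The recursive option creates any missing intermediate directories.
function ensureParentDir(filePath) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
}

const output = guessOutputPath(
  "/site/source",
  "/site/public",
  "/site/source/images/cat.jpg",
  "thumbnail"
);
```

    Calling ensureParentDir(output) before handing output to the resizer covers the case where the output directory structure does not exist yet.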

    Using the images

    So we’ve resized the images, and now the output directory contains a copy of the original and potentially several resized copies with various names. We could just use one of these new filenames in our Markdown posts directly, but that situation is brittle, because changing the names of profiles could cause images not to show up at all.

    While we’re messing with the images, it’s a good time to note that Markdown doesn’t have a mechanism for captioning images, besides repeating some HTML snippet for each one. This too is brittle, and mixes elements of the site’s theme into the source posts, which is not what we want.

    Unfortunately (or maybe fortunately, depending on your bent), Markdown doesn’t have an official means of extension, so I can’t build support for captions and image profiles directly. Hexo allows Nunjucks tags in posts through its Tag API, which is the next best option. The upside is that we get a lot more options for controlling the output now, but the tradeoff is that we’re no longer writing standard Markdown, and our posts depend on this plugin now. I considered this trade for a few days, and decided this was the best course, but it was a tough call.

    The imsize tag

    I wanted a tag that is as self-documenting as possible, so if this plugin dies or the posts need to migrate to a different platform, it will be as easy as possible to support these image tags.

    There are two choices for passing data to Nunjucks tags: “args” and “content.” A tag looks like this:

    {% blockquote David Levithan, Wide Awake %}
    Do not just seek happiness for yourself. Seek happiness for all.
    Through kindness. Through mercy.
    {% endblockquote %}
    • args is an array formed from the string passed in the tag line. Here, it is ["David", "Levithan,", "Wide", "Awake"]. Hexo parses the arg line here.
    • content is the string between the opening and closing tags.

    Given that I want self-documenting tags, I’m not too worried about brevity. Copy-and-paste is pretty easy, and authors can set up editor shortcuts to insert these tags if needed.

    Passing arguments through the arg line becomes unreadable for more than even a few arguments. The spec for that blockquote tag is:

    {% blockquote [author[, source]] [link] [source_link_title] %}

    These are all positional arguments, and any of them can be elided! Handling that is not easy, and it neither scales nor stays self-explanatory later.

    Instead, I decided to embed a YAML document in the tag’s content. YAML is very easy to read, and I don’t have to write a parser for it. By using a YAML document in the tag content, I can add more arguments in the future without breaking existing documents.

    Tags look like this:

    ![Dell Precision 5510 repair](../uploads/2017/01/05/5510-repair.jpg)

    We can specify the src and profile, and alt text. In the future, we could add a caption, and potentially even an output format (.webp, for example). I’d like to add those possibilities soon.

    Since this is normal YAML, other people can write parsers for this tag for their blogging platforms if needed. It’s a little wordy, but I think it’s very clear.
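
    To illustrate how little another platform would need in order to read such a tag, here is a minimal sketch that extracts flat "key: value" pairs from a tag’s content. It handles only that simplest subset of YAML, and a real implementation would hand the content to a proper YAML parser. The function name, the alt key, and the "thumbnail" profile value are assumptions made for the example.

```javascript
// Hypothetical sketch: read flat "key: value" lines from a tag's content.
// This covers only the simplest subset of YAML; a real plugin would pass
// the content to a proper YAML parser instead.
function parseTagContent(content) {
  const options = {};
  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue; // blank or comment
    const colon = trimmed.indexOf(":");
    if (colon === -1) continue; // not a key/value line
    const key = trimmed.slice(0, colon).trim();
    const value = trimmed.slice(colon + 1).trim();
    options[key] = value;
  }
  return options;
}

// Assumed tag body; the key names and profile value are illustrative.
const content = [
  "src: ../uploads/2017/01/05/5510-repair.jpg",
  "alt: Dell Precision 5510 repair",
  "profile: thumbnail",
].join("\n");

const options = parseTagContent(content);
```

    In Hexo, a tag function registered with hexo.extend.tag.register and the ends option would receive a string like this as its content argument.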

    There’s more to come on this project, hopefully soon! I’m already using this plugin to generate the images on this site. Head over to the GitHub repo to try it out!

  • Lazy image resizing in hexo-image-sizes v2

    23 April 2018 · Post 2 of 2 in hexo-image-sizes

    I just released version 2 of hexo-image-sizes, which represents a complete rewrite of the plugin. It finally supports lazy image resizing. In this post, I will cover the high-level design ideas behind the new version and some of the challenges in making it.

    Check out my first post on it for more background on the functionality.

    Lazy image resizing with imsize tags

    hexo-image-sizes uses a special tag format for adding images to posts in Hexo. Users add an imsize tag to their post when they want to use an image with the plugin, and in the tag they can specify various details of the image including its size profile, its alt text, its link target, and more.

    Here’s an example use of imsize within a Markdown post:

    ![Arduino Duemilanove](../uploads/2010/06/ArduinoDuemilanove.jpeg)

    When Hexo renders a post containing an imsize tag, it invokes the tag’s registered function. The tag function has access to the user’s arguments and some context information including the current page on which it appears, which will be important for resolving filenames.

    The imsize tag affords the plugin a special ability. Because users must invoke the tag when they want to embed a resized image in a post, hexo-image-sizes can monitor exactly which images the user has actually made visible in their posts. The new version of the plugin exploits this fact to avoid resizing images that the user never makes visible on the site.

    The plugin’s operation now comprises two distinct phases during static site generation:

    1. First, while Hexo renders each post on the site, the plugin monitors all the imsize tags the user includes in posts. Each time the user includes an imsize tag, the plugin records which image the user included and the desired new size in a cache in memory.
    2. After Hexo has processed all of the posts in the site, the plugin knows which images the user wants to make visible. It generates resized versions of each visible image.
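
    The two phases can be sketched with a small in-memory queue. This is an illustration rather than the plugin’s code: the class and method names are hypothetical, the "banner" profile is made up, and resize stands in for the real resizing call.

```javascript
// Hypothetical sketch of the two phases; all names are illustrative.
class ResizeQueue {
  constructor() {
    // Keyed by "<profile>:<image path>" so each pair is stored once.
    this.pending = new Map();
  }

  // Phase 1: called each time an imsize tag is rendered.
  record(imagePath, profile) {
    this.pending.set(`${profile}:${imagePath}`, { imagePath, profile });
  }

  // Phase 2: called once, after all posts have been rendered.
  // `resize` stands in for the real image-resizing call.
  flush(resize) {
    const jobs = [...this.pending.values()];
    this.pending.clear();
    return jobs.map((job) => resize(job.imagePath, job.profile));
  }
}

const queue = new ResizeQueue();
queue.record("posts/2018-cats/cat1.jpg", "thumbnail");
queue.record("posts/2018-cats/cat1.jpg", "thumbnail"); // same image used twice
queue.record("posts/2018-cats/cat1.jpg", "banner");

const outputs = queue.flush((imagePath, profile) => `${profile} <- ${imagePath}`);
```

    Keying the cache on both the image and the profile means an image that appears in several posts at the same size is only resized once, and an image that never appears in a tag is never resized at all.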

    Using Hexo’s router to avoid filesystem manipulation

    Hexo has a router module responsible for tracking which files are in use on the site and what their contents should be. The router is a map from the path of the file (relative to the Hexo site’s public output directory) to a stream of the contents of the file.

    The documentation for the router is missing some important details. First, it’s very important to know that each route corresponds to a file in the public directory and not in the source directory. Therefore, if you add a route, you will add a file to the public site but not alter the source directory. When I first began work on this plugin, I assumed that the routes referenced files in the source directory. I learned about using the router from a similar plugin, hexo-filter-responsive-images.

    Using the router to add new resized images to the public output of the site generator has several advantages. By getting each file’s contents from the route stream managed by the router, my plugin can start to cooperate with other asset-management plugins, like image filters and minifiers. Using the router also allows me to remove the hacky filesystem code I was using before, where I had to guess the output location of each file and create the directory structure there.

    Using the router also has a downside. I’m using sharp to resize the images, and it takes as input either a file or a Buffer of image data, while the router provides a stream for each file. To run the image through sharp, I have to read the full file stream into memory (in a Buffer), which adds a lot to the memory use of the plugin. In its original form, the plugin would invoke sharp with file pointers instead of file contents, which meant that only sharp would need to see the full contents of the image, outside of the JavaScript VM. For my site, the increased memory use has not been an issue.
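
    The extra memory cost comes from a step like the one sketched below, which collects a whole stream into a single Buffer. The helper name is hypothetical, and the stream here is a stand-in built with Node’s Readable.from rather than a real route stream.

```javascript
const { Readable } = require("stream");

// Collect every chunk of a readable stream into one Buffer.
// This is the step that pulls the whole image into JavaScript memory.
function streamToBuffer(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on("data", (chunk) => chunks.push(Buffer.from(chunk)));
    stream.on("error", reject);
    stream.on("end", () => resolve(Buffer.concat(chunks)));
  });
}

// Stand-in for a route stream; in a plugin it would come from the router.
const fakeRoute = Readable.from([Buffer.from("image-"), Buffer.from("bytes")]);

// A Promise that resolves to the complete file contents.
const whole = streamToBuffer(fakeRoute);
```

    In the plugin’s setting, the resulting Buffer would be what gets passed to sharp, and the resized result would be registered with the router under the new path.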

    Keeping Hexo running until all images are resized

    I discovered that running Hexo in server mode (hexo server) can behave differently from running it to generate files and exit (hexo generate). My site would generate without issue when running the server, but hexo generate would only produce a few of the necessary images.

    The problem was with my after_generate filter function. Hexo uses the return value of filters to determine when they are finished running, and my filter wasn’t returning anything. When Hexo was running in server mode, it would keep running indefinitely, long enough to allow my plugin to finish resizing everything. In generate mode, it would exit as my plugin was just starting, because a filter that returns nothing looks to Hexo as if it has already finished synchronously.

    To fix the issue, I simply return a Promise from the filter function. The Promise resolves when every image has been resized.
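
    A minimal sketch of that fix, with simulated work in place of real resizing: the function names and the jobs argument are hypothetical, and resizeImage uses a timer to mimic an asynchronous sharp call.

```javascript
// Hypothetical sketch: a filter that reports completion through a Promise.
// `resizeImage` simulates slow, asynchronous work such as a sharp call.
function resizeImage(job) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`${job.profile}-${job.name}`), 10);
  });
}

function afterGenerate(jobs) {
  // Returning the Promise lets the caller wait for every resize.
  // Without this `return`, the caller would see `undefined` and move on.
  return Promise.all(jobs.map(resizeImage));
}

const finished = afterGenerate([
  { name: "cat1.jpg", profile: "thumbnail" },
  { name: "cat2.jpg", profile: "thumbnail" },
]);
```

    Promise.all resolves only after every individual resize has resolved, which matches the requirement that the filter not finish until all images are done.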

    Normalizing file paths throughout the application

    Hexo uses several different types of file path, and mixing them was hindering my development of the plugin for a while. The following paths are involved in resizing an image:

    • The path the user put in the imsize tag, which could be:
      • Absolute (with a leading slash), meaning it references the Hexo source directory
      • Relative (no leading slash), meaning it is relative to the current post:
        • If the current post is a blog post and has an asset directory (post_asset_folder is true in the Hexo config), then the path starts in the asset directory
        • If the current post is not a blog post or post_asset_folder is false, then the path starts in the directory of the post file.

    The plugin also has access to:

    • The absolute path to Hexo’s source directory
    • The absolute path to the current post’s source file
    • The absolute path to the current post’s asset directory, if there is one

    The plugin computes:

    • A relative path from Hexo’s source directory to the image the user wants to resize. For example, a cat picture might have path /posts/2018-cats/cat1.jpg
    • A copy of that relative path with the image renamed to have its profile name as a prefix. For example, a thumbnail-sized version of the above cat picture might have path /posts/2018-cats/thumbnail-cat1.jpg

    Once we know the Hexo-relative path to the input image and the output image, we have all the information necessary to resize the image.
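
    The rules above can be sketched as two small functions. This is one possible reading rather than the plugin’s implementation: the function and parameter names are hypothetical, the directory layout in the example is made up, and the leading slash on the result simply mirrors the example paths given above.

```javascript
const path = require("path").posix;

// Hypothetical sketch of the path rules listed above.
// All arguments except userPath are absolute paths; assetDir may be null.
function toSourceRelative({ userPath, sourceDir, postFile, assetDir }) {
  let absolute;
  if (userPath.startsWith("/")) {
    // Leading slash: resolve against the Hexo source directory.
    absolute = path.join(sourceDir, userPath);
  } else {
    // No leading slash: start in the asset directory if the post has one,
    // otherwise in the directory containing the post file.
    const base = assetDir || path.dirname(postFile);
    absolute = path.join(base, userPath);
  }
  return "/" + path.relative(sourceDir, absolute);
}

// Prefix the file name with the profile name, keeping the directory.
function withProfile(relativePath, profile) {
  const dir = path.dirname(relativePath);
  return path.join(dir, `${profile}-${path.basename(relativePath)}`);
}

const input = toSourceRelative({
  userPath: "cat1.jpg",
  sourceDir: "/site/source",
  postFile: "/site/source/posts/2018-cats.md",
  assetDir: "/site/source/posts/2018-cats",
});
const output = withProfile(input, "thumbnail");
```

    Normalizing every path to this one source-relative form as early as possible means the rest of the code never has to ask which kind of path it is holding.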