Resolving markdown images with Astro

Update (7th August 2022): The RC7 release of Astro, broke the implementation shown below. But the overall idea still remains the same. You can still read the post below to get a basic idea and then read the follow up article which shows an updated implementation.

While migrating my blog to Astro, I came across a problem related to images with markdown content. Astro does not resolve images within the markdown content. One can use MDX flavored markdown to get around this, but that would require migration of existing content, which I was not willing to take up. So I googled about it and no useful results turned up. So I started looking into the Plugin API of Astro. Turns out, Astro uses ViteJS under the hood and I could write a Vite Plugin for my use case.

Writing the plugin

Initially, I added a very basic plugin to check the output generated by Astro for markdown files.

// astro.config.ts
import { defineConfig } from 'astro/config';

// https://astro.build/config
export default defineConfig({
  vite: {
    plugins: [
      {
        name: 'image-plugin',
        enforce: true,
        transform(code, id) {
          if (id.endsWith('.md')) {
            console.log(code);
          }
        },
      },
    ],
  },
});

The generated output looked something like this upon inspection:

export const metadata = {
  /* metadata */
};

export const frontmatter = {
  /* metadata */
};

export function rawContent() {
  return 'Raw content here';
}

export function compiledContent() {
  return 'Content compiled to html here';
}

Take note of the function compiledContent above. As the name suggests, it gives the final HTML output, which is written to the files. The complied content can be modified to return the desired HTML and accomplish our goal.

Consider the following example.

export function compiledContent() {
  return '<img src="./path/to/image.png" />';
}

Given the above img tag as input, we need to import the src and update it with imported reference. The updated code should look as shown below.

import image from './path/to/image.png';

export function compiledContent() {
  return `<img src="${image}" />`;
}

This is how I went about implementing this. Since the code received by the plugin is in string format, I used recast to parse and modify the code using AST.

import { parse, visit, prettyPrint } from 'recast';

// focusing only on the transform function
transform(code, id) {
  if (id.endsWith('.md')) {
    const ast = parse(code);
    visit(ast, {
      visitFunctionDeclaration(path) {
        if (path.node.id?.name === 'compiledContent') {
          const returnStatement = path.node.body.body[0];

          if (returnStatement.type === 'ReturnStatement' && returnStatement.argument) {
            // TODO: modify the returnStatement.argument
          }
        }

        return false;
      },
    });

    const finalCode = `${imgImports.join('\n')}\n${prettyPrint(ast).code}`;

    return {
      code: finalCode,
    };
  }
}

returnStatement.argument is the return statement of the compiledContent function. It can be converted to a string using recast as prettyPrint(returnStatement.argument).code. Now, this can be used to modify the HTML as per the requirements.

Using a regex, I updated all the instances of the image tag and collected the import statements in an array. My final code looked something like this:

import path from 'node:path';
import { parse, visit, prettyPrint } from 'recast';
import { camelCase } from 'change-case';

const IMG_REGEX = /<img\s.*?(src=('|")(.*?)(\2)).*?>/g;

function processHTMLContent(content: string, imgImports: string[]) {
  const newContent = content.replace(IMG_REGEX, (imgTag, fullSrc, _0, src) => {
    const variableName = camelCase(path.basename(src));

    imgImports.push(`import ${variableName} from "${src}";`);

    const updatedImg = imgTag.replace(fullSrc, 'src="${' + variableName + '}"');

    return updatedImg;
  });

  return newContent;
}

// focusing only on the transform function here
transform(code, id) {
  if (id.endsWith('md')) {
    const imgImports = []; // collecting all the imports here
    const ast = parse(code);

    visit(ast, {
      visitFunctionDeclaration(path) {
        if (path.node.id?.name === 'compiledContent') {
          const returnStatement = path.node.body.body[0];

          if (returnStatement.type === 'ReturnStatement' && returnStatement.argument) {
            const { code } = prettyPrint(returnStatement.argument, printOptions);
            const processedHTML = processHTMLContent(code, imgImports);

            returnStatement.argument = parse(processedHTML).program.body[0];
          }
        }

        return false;
      },
    });

    const finalCode = `${imgImports.join('\n')}\n${prettyPrint(ast).code}`;

    return {
      code: finalCode,
    };
  }
}

This worked out pretty well for my setup. I know that there might be some edge cases with regex matching, but I’ll cross that bridge when I come to it. Probably I can switch to using AST-based tools like posthtml instead of using regex.

astrovite