
Middleware

Middleware is deprecated and kept only as a compatibility layer for the newer plugin system.

Middleware functions still run on the HTML string before it is parsed into DocumentElement nodes, but internally each middleware entry is adapted into a plugin with a beforeParse hook.

Use Plugins for new code.

Signature

import { Middleware } from 'html-to-document';

type Middleware = (html: string) => Promise<string>;

See the Types Reference for the full definition.

Default Middleware Behavior

The old built-in whitespace minifier now exists as the default minify plugin. Setting clearMiddleware: true still disables it, because clearMiddleware: true implies enableDefaultPlugins: false unless you set enableDefaultPlugins explicitly.

The default behavior is still:

  • Strips HTML comments (<!-- ... -->)
  • Collapses consecutive whitespace into a single space (outside <pre>)
  • Removes unnecessary whitespace between tags
  • Trims leading and trailing whitespace
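
To make the four steps above concrete, here is a minimal sketch of a middleware-shaped minifier. It is not the library's implementation: minifyHtml, the pre-slot placeholder, and the regular expressions are assumptions chosen for illustration, and the real minify plugin may treat edge cases (for example comments inside <pre>) differently.

```typescript
// Local copy of the documented signature so the sketch is self-contained.
type Middleware = (html: string) => Promise<string>;

// Hypothetical helper, not part of html-to-document.
function minifyHtml(html: string): string {
  // Park <pre> blocks behind placeholders so their whitespace is left alone.
  const blocks: string[] = [];
  const protectedHtml = html.replace(/<pre\b[\s\S]*?<\/pre>/gi, (block) => {
    blocks.push(block);
    return `<pre-slot-${blocks.length - 1}/>`;
  });

  const minified = protectedHtml
    .replace(/<!--[\s\S]*?-->/g, '') // strip HTML comments
    .replace(/\s+/g, ' ')            // collapse consecutive whitespace
    .replace(/>\s+</g, '><')         // remove whitespace between tags
    .trim();                         // trim leading and trailing whitespace

  // Put the original <pre> blocks back in place of their placeholders.
  return minified.replace(
    /<pre-slot-(\d+)\/>/g,
    (match, n) => blocks[Number(n)] ?? match
  );
}

// The same logic wrapped in the (deprecated) Middleware signature.
const minify: Middleware = async (html) => minifyHtml(html);
```

The placeholder trick is just one way to honor the "outside <pre>" rule; a real implementation would more likely work on a tokenized document than on regular expressions.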

Custom Middleware

There are two legacy ways to register middleware:

1. Via init options

import { init, Middleware } from 'html-to-document';

// Example: remove all <script> tags (case-insensitive)
const stripScripts: Middleware = async (html) =>
  html.replace(/<script\b[\s\S]*?>[\s\S]*?<\/script>/gi, '');

const converter = init({
  clearMiddleware: true, // skip default minifier
  middleware: [stripScripts],
});

2. Programmatically

const converter = init();
// Add another middleware after initialization
converter.useMiddleware(stripScripts);

Note: Middleware functions are executed in the order they are passed in or registered. Make sure to arrange them accordingly if one depends on the output of another.
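
The effect of ordering can be sketched with a small stand-alone runner. runInOrder, renameBold, and stripStrong are hypothetical names used only for this illustration; the runner imitates the documented "in order" behavior and is not the library's internal code.

```typescript
// Local copy of the documented signature so the sketch is self-contained.
type Middleware = (html: string) => Promise<string>;

// Hypothetical runner: each middleware receives the output of the previous one.
async function runInOrder(html: string, chain: Middleware[]): Promise<string> {
  let current = html;
  for (const middleware of chain) {
    current = await middleware(current);
  }
  return current;
}

// Rewrites <b>...</b> as <strong>...</strong>.
const renameBold: Middleware = async (html) =>
  html.replace(/<(\/?)b>/g, '<$1strong>');

// Removes <strong> tags but keeps their text content.
const stripStrong: Middleware = async (html) =>
  html.replace(/<\/?strong>/g, '');

async function demo(): Promise<void> {
  const input = '<p><b>Hi</b></p>';
  // renameBold first: stripStrong sees the <strong> tags and removes them.
  console.log(await runInOrder(input, [renameBold, stripStrong]));
  // stripStrong first: it finds nothing, so the renamed tags survive.
  console.log(await runInOrder(input, [stripStrong, renameBold]));
}
```

Calling demo() logs <p>Hi</p> for the first order and <p><strong>Hi</strong></p> for the second, which is why a middleware that depends on another's output has to be registered after it.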

When both plugins and deprecated middleware are provided through init() or the Converter constructor, plugin beforeParse hooks run first and adapted middleware runs after them.

Example: Sanitizing HTML

import { init, Middleware } from 'html-to-document';

// Remove all inline styles
const removeStyles: Middleware = async (html) =>
  html.replace(/ style="[^"]*"/g, '');

const converter = init({
  middleware: [removeStyles],
});

converter.convert('<p style="color:red">Hello</p>', 'docx')
  .then((buffer) => {
    /* ... */
  })
  .catch(console.error);

Migration to Plugins

import { init } from 'html-to-document';

const converter = init({
  plugins: [
    {
      // Plugin equivalent of the removeStyles middleware above
      beforeParse: async (context) => {
        context.setHtml(context.html.replace(/ style="[^"]*"/g, ''));
      },
    },
  ],
});
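
If you have several existing middleware functions, a small wrapper can migrate them mechanically. The sketch below is an assumption-laden illustration: toPlugin is a hypothetical helper, and the BeforeParseContext and BeforeParsePlugin interfaces only model the two members used in the example above (context.html and context.setHtml); see the Types Reference for the real plugin types.

```typescript
// Local copy of the documented signature so the sketch is self-contained.
type Middleware = (html: string) => Promise<string>;

// Assumed minimal shape of the beforeParse context, inferred from the example above.
interface BeforeParseContext {
  readonly html: string;
  setHtml(html: string): void;
}

// Assumed minimal plugin shape: only the beforeParse hook is modeled here.
interface BeforeParsePlugin {
  beforeParse: (context: BeforeParseContext) => Promise<void>;
}

// Hypothetical helper, not part of html-to-document:
// wraps a legacy middleware function in a beforeParse hook.
function toPlugin(middleware: Middleware): BeforeParsePlugin {
  return {
    beforeParse: async (context) => {
      context.setHtml(await middleware(context.html));
    },
  };
}

// A legacy middleware to migrate: strips inline style attributes.
const removeInlineStyles: Middleware = async (html) =>
  html.replace(/ style="[^"]*"/g, '');

const removeInlineStylesPlugin = toPlugin(removeInlineStyles);
```

Under these assumptions, the wrapped value could be passed as plugins: [removeInlineStylesPlugin] in the init options instead of listing the function under middleware.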