
Initialization

The init function is your main entry point to configure and initialize the converter engine. It returns a Converter instance that can parse HTML and convert it into document formats like DOCX, PDF, or Markdown. Through init, you can register custom adapters, tag handlers, middleware, and default styles to control how HTML is interpreted and styled.

Quick Start

Here's a minimal example to get started:

import { init, DocxAdapter } from 'html-to-document';

const converter = init({
  adapters: {
    register: [
      { format: 'docx', adapter: DocxAdapter },
    ],
  },
});


const html = '<h1>Hello</h1><p>This is a paragraph</p>';

converter.convert(html, 'docx').then((blobOrBuffer) => {
  // Save or download the result
  console.log('Document generated');
});

For full customization, refer to the options below.

Signature

import { init } from 'html-to-document';

declare function init(options?: InitOptions): Converter;

The options object conforms to the InitOptions type and supports the following properties:

middleware?: Middleware[]

Register one or more middleware functions that transform or sanitize the HTML before it is parsed, e.g., stripping scripts, normalizing whitespace, or injecting metadata.

  • Type: Middleware[]
  • Default: [minifyMiddleware], applied automatically unless clearMiddleware is true
  • Example:
    import { init } from 'html-to-document';
    import { customMiddleware1, customMiddleware2 } from './middleware';

    const converter = init({
      middleware: [customMiddleware1, customMiddleware2],
    });
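
As a rough illustration of what a custom middleware might look like, the sketch below strips script elements from the input. It assumes a middleware is an async function from an HTML string to an HTML string; the local Middleware type and the names stripScripts and stripScriptsMiddleware are hypothetical stand-ins, so check the package's exported type before relying on this shape.

```typescript
// Assumed shape of a middleware: HTML string in, HTML string out (async).
// Declared locally for illustration; not imported from the library.
type Middleware = (html: string) => Promise<string>;

// Pure helper: removes <script>...</script> blocks, case-insensitively.
// A regex is adequate for a sketch, not for adversarial input.
function stripScripts(html: string): string {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '');
}

// Hypothetical middleware that wraps the helper.
const stripScriptsMiddleware: Middleware = async (html) => stripScripts(html);
```

Under that assumption, it would be registered the same way as the example above: init({ middleware: [stripScriptsMiddleware] }).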

clearMiddleware?: boolean

Skips registering the default minifyMiddleware. When true, only your provided middleware functions will be used.

  • Type: boolean
  • Default: false
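
One way to picture the interaction between middleware and clearMiddleware is the sketch below. It is not the library's implementation: minifyStandIn is a hypothetical substitute for minifyMiddleware, resolveMiddleware and runMiddleware are illustrative names, and both the ordering (default first, then yours) and the sequential execution are assumptions the text above does not confirm.

```typescript
// Assumed middleware shape, declared locally for illustration.
type Middleware = (html: string) => Promise<string>;

// Hypothetical stand-in for the default minifier: drops whitespace between tags.
const minifyStandIn: Middleware = async (html) =>
  html.replace(/>\s+</g, '><').trim();

// Assumption: the default runs before user middleware unless it is cleared.
function resolveMiddleware(
  user: Middleware[] = [],
  clearMiddleware = false,
): Middleware[] {
  return clearMiddleware ? [...user] : [minifyStandIn, ...user];
}

// Assumption: each middleware receives the output of the previous one.
async function runMiddleware(html: string, chain: Middleware[]): Promise<string> {
  let current = html;
  for (const mw of chain) {
    current = await mw(current);
  }
  return current;
}
```

With clearMiddleware set to true, the resolved chain contains only the functions you passed in, which matches the behavior described above.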

tags?

{
  tagHandlers?: TagHandlerObject[];
  defaultStyles?: { key: string; styles: Styles }[];
  defaultAttributes?: { key: string; attributes: Record<string, any> }[];
}

Customize how HTML tags are parsed and styled before conversion.

  • tagHandlers: Provide custom TagHandlerObject overrides:
    const customHandler: TagHandlerObject = { /* ... */ };
    init({ tags: { tagHandlers: [customHandler] } });
  • defaultStyles: Fallback style definitions per HTML tag:
    init({ tags: {
      defaultStyles: [
        { key: 'p', styles: { marginBottom: 10, lineHeight: 1.5 } },
      ],
    } });
  • defaultAttributes: Fallback attributes per HTML tag:
    init({ tags: {
      defaultAttributes: [
        { key: 'img', attributes: { width: 600 } },
      ],
    } });
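
To make the word "fallback" concrete, here is a small sketch of how per-tag defaults could combine with an element's own styles. It assumes the element's own styles win over the defaults; the Styles and TagDefault types and the resolveStyles function are local, hypothetical stand-ins rather than the library's internals.

```typescript
// Local stand-ins for illustration; the library's own Styles type is richer.
type Styles = Record<string, string | number>;
type TagDefault = { key: string; styles: Styles };

// Hypothetical resolver: defaults for the tag apply first,
// then the element's own styles override them.
function resolveStyles(tag: string, own: Styles, defaults: TagDefault[]): Styles {
  const fallback = defaults
    .filter((d) => d.key === tag)
    .reduce<Styles>((acc, d) => ({ ...acc, ...d.styles }), {});
  return { ...fallback, ...own };
}

const tagDefaults: TagDefault[] = [
  { key: 'p', styles: { marginBottom: 10, lineHeight: 1.5 } },
];

// The paragraph sets its own lineHeight, so only marginBottom falls back.
const resolved = resolveStyles('p', { lineHeight: 2 }, tagDefaults);
```

The same reasoning would apply to defaultAttributes, with attributes in place of styles.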

adapters?

{
  register?: { format: string; adapter: AdapterProvider; config?: object }[];
  defaultStyles?: { format: string; styles: Record<ElementType, Styles> }[];
  styleMappings?: { format: string; handlers: StyleMapping }[];
}

Adapters determine how the parsed content is rendered into a final document format. The adapters option controls which adapters are registered and how CSS styles map to document properties. You can register your own adapter (e.g., for Markdown) or extend existing ones, such as the built-in DOCX adapter.

  • register: List of custom adapters implementing IDocumentConverter:

    init({ adapters: {
      register: [
        { format: 'md', adapter: MyAdapter },
      ],
    } });
  • defaultStyles: Fallback styles per element type for each format:

    init({ adapters: {
      defaultStyles: [
        { format: 'docx', styles: { paragraph: { color: 'darkblue', fontSize: 24 } } },
      ],
    } });
  • styleMappings: Custom CSS → document property mappings via StyleMapping:

    init({ adapters: {
      styleMappings: [
        { format: 'docx', handlers: { fontWeight: (v) => ({ bold: v === 'bold' }) } },
      ],
    } });
  • config: Optional adapter-specific configuration object, set on an individual entry in register. For example, the built-in DocxAdapter supports custom block, inline, and fallthrough converters:

    init({ adapters: {
      register: [
        {
          format: 'docx',
          adapter: DocxAdapter,
          config: {
            blockConverters: [new MyBlockConverter()],
            inlineConverters: [new MyInlineConverter()],
            fallthroughConverters: [new MyFallthroughConverter()],
          },
        },
      ],
    } });
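
The styleMappings handlers above each take a CSS value and return a fragment of document properties. The sketch below shows one plausible way such handlers could be applied to an element's styles; applyStyleMappings, StyleHandler, and StyleMappingSketch are hypothetical names, and the merge strategy (later fragments overwrite earlier ones, unmapped properties are ignored) is an assumption, not the library's documented behavior.

```typescript
// Hypothetical local types; the library's StyleMapping type may differ.
type StyleHandler = (value: string) => Record<string, unknown>;
type StyleMappingSketch = Record<string, StyleHandler>;

// Runs the matching handler for each CSS property and merges the fragments.
// Properties without a handler are skipped.
function applyStyleMappings(
  css: Record<string, string>,
  handlers: StyleMappingSketch,
): Record<string, unknown> {
  const props: Record<string, unknown> = {};
  for (const [name, value] of Object.entries(css)) {
    const handler = handlers[name];
    if (handler) {
      Object.assign(props, handler(value));
    }
  }
  return props;
}

const handlers: StyleMappingSketch = {
  fontWeight: (v) => ({ bold: v === 'bold' }),
  fontStyle: (v) => ({ italic: v === 'italic' }),
};

// color has no handler here, so it does not appear in the result.
const props = applyStyleMappings(
  { fontWeight: 'bold', fontStyle: 'normal', color: 'red' },
  handlers,
);
```

Keeping each handler a pure function of a single CSS value makes mappings easy to test in isolation, whatever the adapter does with the resulting properties.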

domParser?: IDOMParser

Use a custom DOM parser implementation.

  • Type: IDOMParser
  • Example:
    class CustomParser implements IDOMParser {
      parse(html: string) { /* ... */ }
    }
    init({ domParser: new CustomParser() });

Example Usage

import { init } from 'html-to-document';
import { MyAdapter } from './my-adapter';
import { customMiddleware } from './middleware';
import { CustomParser } from './parser';

const converter = init({
  clearMiddleware: false,
  middleware: [customMiddleware],
  tags: {
    defaultStyles: [{ key: 'p', styles: { marginBottom: 8 } }],
  },
  adapters: {
    register: [{ format: 'md', adapter: MyAdapter }],
    defaultStyles: [
      { format: 'md', styles: { paragraph: { indent: 20 } } },
    ],
    styleMappings: [
      { format: 'md', handlers: { fontStyle: (v) => ({ italic: v === 'italic' }) } },
    ],
  },
  domParser: new CustomParser(), // custom DOM parser implementation
});

// Convert with the 'md' adapter registered above
converter.convert('<h1>Title</h1><p>Text</p>', 'md')
  .then((result) => console.log('Generated Markdown:', result))
  .catch(console.error);

Learn More