HTML to Document API Overview

This is the core html-to-document library for converting HTML content into professional documents (e.g., DOCX, PDF).

Installation

npm install html-to-document

Import

import { init, Converter } from 'html-to-document';

Quick Start

import { init, DocxAdapter } from 'html-to-document';

const converter = init({
  adapters: {
    register: [{ format: 'docx', adapter: DocxAdapter }],
  },
  // Other configuration
});

// Convert HTML string to DOCX buffer
converter
  .convert('<h1>Hello World</h1><p>This is a paragraph.</p>', 'docx')
  .then((buffer) => {
    // handle Buffer in Node.js or Blob in browsers
  })
  .catch(console.error);

API

init(options?: InitOptions): Converter

Initialize a new Converter instance.

  • options: InitOptions (optional)
    • plugins?: [Plugin](./types)[] – plugin hooks for HTML preprocessing, document inspection, and post-parse element replacement.
    • enableDefaultPlugins?: boolean – enable or disable built-in plugins.
    • middleware?: [Middleware](./types)[] – deprecated middleware compatibility layer.
    • tags?: { tagHandlers?: [TagHandlerObject](./types)[]; defaultStyles?: ...; defaultAttributes?: ... } – custom tag handlers and default tag options.
    • adapters?: { defaultStyles?: ...; register?: { format: string; adapter: [AdapterProvider](./types); config?: object; createAdapter?: ... }[] } – register adapters, customize construction per adapter, and pass adapter-specific config.
    • clearMiddleware?: boolean – deprecated legacy switch; when set, it implicitly disables the default plugins.
    • domParser?: [IDOMParser](./types) – custom DOM parser implementation.

Returns: a configured Converter instance.

Converter

Class for parsing and converting HTML to document formats.

new Converter(options: ConverterOptions)

Create a Converter with raw options:

  • tags?: ... – alias for options.tags in init.
  • plugins?: [Plugin](./types)[]
  • enableDefaultPlugins?: boolean
  • middleware?: [Middleware](./types)[] (deprecated)
  • clearMiddleware?: boolean (deprecated)
  • adapters?: ...
  • registerAdapters?: { format: string; adapter: [IDocumentConverter](./types) }[]
  • domParser?: [IDOMParser](./types)

Methods

convert(content: string | [DocumentElement](./types)[], format: string): Promise<Buffer | Blob>

Convert an HTML string or a pre-parsed DocumentElement[] to a Buffer (Node.js) or Blob (browser).

  • content: raw HTML string or an array of DocumentElement.
  • format: target format key (e.g., 'docx', 'pdf').

Returns: Promise<Buffer | Blob>
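
Because convert resolves to a Buffer in Node.js and a Blob in browsers, code shared between both environments may want to normalize the result first. The following is a minimal sketch and not part of the library: toBytes is a hypothetical helper name, and it relies only on a Node.js Buffer being a Uint8Array and on Blob exposing arrayBuffer().

```typescript
// Hypothetical helper (not part of html-to-document): normalize the
// result of converter.convert() to a plain Uint8Array.
async function toBytes(output: Uint8Array | Blob): Promise<Uint8Array> {
  // A Node.js Buffer is a Uint8Array subclass, so it passes through unchanged.
  if (output instanceof Uint8Array) {
    return output;
  }
  // Browser path: read the Blob's contents into memory.
  return new Uint8Array(await output.arrayBuffer());
}
```

In Node.js, the resulting bytes could then be passed to writeFile from node:fs/promises to save the document to disk.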

parse(html: string): Promise<[DocumentElement](./types)[]>

Parse raw HTML into an intermediate representation.

  • html: HTML string.

Returns: Promise<[DocumentElement](./types)[]>
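
Since convert also accepts the DocumentElement[] that parse returns, the two calls can be split to inspect or filter the intermediate representation in between. A minimal sketch, assuming only the parse and convert signatures above: ParseAndConvert and convertFiltered are illustrative names, and the element type is left generic because its fields are not described on this page.

```typescript
// Structural stand-in for the two Converter methods used here.
interface ParseAndConvert<E, Out> {
  parse(html: string): Promise<E[]>;
  convert(content: string | E[], format: string): Promise<Out>;
}

// Parse HTML, keep only the elements accepted by `keep`, then convert.
async function convertFiltered<E, Out>(
  converter: ParseAndConvert<E, Out>,
  html: string,
  format: string,
  keep: (element: E) => boolean,
): Promise<Out> {
  const elements = await converter.parse(html);
  return converter.convert(elements.filter(keep), format);
}
```

With a real Converter instance this might be called as convertFiltered(converter, html, 'docx', (el) => el.type !== 'heading'), where the type field is an assumption about the DocumentElement shape.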

useMiddleware(mw: Middleware): void

Register a middleware function to process HTML before parsing.
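
The Types section describes Middleware as an asynchronous function that takes an HTML string and returns a Promise of a string. A minimal sketch of such a function follows; HtmlMiddleware is a local stand-in for the library's Middleware type, and stripScripts is an illustrative name.

```typescript
// Local stand-in for the library's Middleware type, as described under Types.
type HtmlMiddleware = (html: string) => Promise<string>;

// Remove <script>...</script> blocks before the HTML reaches the parser.
const stripScripts: HtmlMiddleware = async (html) =>
  html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');

// Registration on an existing Converter instance would then look like:
// converter.useMiddleware(stripScripts);
```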

usePlugin(plugin: Plugin): void

Register a plugin after construction.

  • plugin: Plugin object with beforeParse, onDocument, and/or afterParse hooks.

registerConverter(name: string, converter: IDocumentConverter): void

Register a custom document converter adapter.
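
A custom adapter only needs to satisfy the IDocumentConverter contract described under Types: a convert method that takes DocumentElement[] and resolves to a Buffer or Blob. The sketch below is an illustration under stated assumptions rather than the library's own code: ElementLike is a local stand-in whose type, text, and content fields are guesses at the DocumentElement shape, and PlainTextAdapter is a hypothetical name.

```typescript
// Local stand-in for DocumentElement; these fields are assumptions.
interface ElementLike {
  type: string;
  text?: string;
  content?: ElementLike[];
}

// Hypothetical adapter that flattens the element tree to plain text.
class PlainTextAdapter {
  async convert(elements: ElementLike[]): Promise<Blob> {
    return new Blob([this.render(elements)], { type: 'text/plain' });
  }

  // Emit one line per element that has text, depth-first.
  private render(elements: ElementLike[]): string {
    return elements
      .map((el) => {
        const own = el.text ? el.text + '\n' : '';
        return own + this.render(el.content ?? []);
      })
      .join('');
  }
}

// Registration on an existing Converter instance would then look like:
// converter.registerConverter('txt', new PlainTextAdapter());
```

Once registered, the format key ('txt' here, chosen for illustration) would be the value passed as the second argument to convert.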

Types

| Type | Description |
| --- | --- |
| InitOptions | Options for initializing the converter via init. |
| ConverterOptions | Internal options for the Converter constructor. |
| Converter | Main class for conversion and parsing. |
| Plugin | Object with optional beforeParse, onDocument, and afterParse hooks for extending the conversion pipeline. |
| Middleware | Asynchronous function taking an HTML string and returning a Promise of string. |
| TagHandler | Handler that processes an HTMLElement with optional TagHandlerOptions and returns a DocumentElement or an array of DocumentElement. |
| TagHandlerObject | { key: string; handler: TagHandler } |
| DocumentElement | Union type for intermediate document elements (paragraph, heading, etc.). |
| ElementType | String literal type of element kinds ('paragraph', 'heading', etc.). |
| Styles | Map of style properties to values (string or number), with support for CSS properties. |
| IDOMParser | Interface for a custom DOM parser with parse(html: string): Document. |
| IDocumentConverter | Interface for adapter converters. Its convert(elements: DocumentElement[]) method returns a Promise resolving to a Buffer or Blob. |
| AdapterProvider | Constructor type for adapters (new(deps: IConverterDependencies) => IDocumentConverter). |

For more details, refer to the source code.