HTML to Document API Overview

html-to-document is the core library for converting HTML content into document formats such as DOCX and PDF.

Installation

npm install html-to-document

Import

import { init, Converter } from 'html-to-document';

Quick Start

import { init, DocxAdapter } from 'html-to-document';

const converter = init({
  adapters: {
    register: [
      { format: 'docx', adapter: DocxAdapter },
    ],
  },
  // Other configuration
});

// Convert HTML string to DOCX buffer
converter.convert('<h1>Hello World</h1><p>This is a paragraph.</p>', 'docx')
  .then((buffer) => {
    // handle Buffer in Node.js or Blob in browsers
  })
  .catch(console.error);

API

init(options?: InitOptions): Converter

Initialize a new Converter instance.

  • options: InitOptions (optional)
    • middleware?: [Middleware](./types)[] – custom middleware functions.
    • tags?: { tagHandlers?: [TagHandlerObject](./types)[]; defaultStyles?: ...; defaultAttributes?: ... } – custom tag handlers and default tag options.
    • adapters?: { defaultStyles?: ...; styleMappings?: ...; register?: { format: string; adapter: [AdapterProvider](./types); config?: object }[] } – register adapters, style mappings, and adapter-specific config.
    • clearMiddleware?: boolean – clear default middleware.
    • domParser?: [IDOMParser](./types) – custom DOM parser implementation.

Returns: a configured Converter instance.
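
As a minimal sketch of the options shape described above, the snippet below builds an options object that clears the default middleware and supplies one custom function. `InitOptionsSketch` and `collapseWhitespace` are hypothetical names that model only the `middleware` and `clearMiddleware` fields listed above. In a real project the object would be passed to `init` imported from the package.

```typescript
// Local stand-ins for the documented types (assumption: only the two fields
// used here are modelled; the real InitOptions has more).
type Middleware = (html: string) => Promise<string>;

interface InitOptionsSketch {
  middleware?: Middleware[];
  clearMiddleware?: boolean;
}

// Hypothetical middleware: collapse runs of whitespace into single spaces.
const collapseWhitespace: Middleware = async (html) =>
  html.replace(/\s+/g, ' ').trim();

const options: InitOptionsSketch = {
  clearMiddleware: true, // remove the default middleware
  middleware: [collapseWhitespace], // register our own function
};

// With the library installed, this would be passed on as: init(options)
```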

Converter

Class for parsing and converting HTML to document formats.

new Converter(options: ConverterOptions)

Create a Converter with raw options:

  • tags?: ... – alias for options.tags in init.
  • adapters?: ...
  • registerAdapters?: { format: string; adapter: [IDocumentConverter](./types); styleMapper: [StyleMapper](./types) }[]
  • domParser?: [IDOMParser](./types)

Methods

convert(content: string | [DocumentElement](./types)[], format: string): Promise<Buffer | Blob>

Convert HTML string or pre-parsed DocumentElement[] to a Buffer (Node.js) or Blob (browser).

  • content: raw HTML or array of DocumentElement.
  • format: target format key (e.g., 'docx' or 'pdf').

Returns: Promise<Buffer | Blob>
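
Because the result type depends on the runtime, calling code often wants a single representation. The helper below is one possible way to normalize either result to raw bytes. `toBytes` is a hypothetical name and not part of the library. It relies on the fact that Node's Buffer is a subclass of Uint8Array.

```typescript
// Normalize a conversion result to raw bytes.
// Node's Buffer is a subclass of Uint8Array, so it takes the first branch.
async function toBytes(result: Uint8Array | Blob): Promise<Uint8Array> {
  if (result instanceof Uint8Array) {
    return result;
  }
  // Blob (browser, or Node 18+): read its contents into memory.
  return new Uint8Array(await result.arrayBuffer());
}

// With a converter from init():
//   const bytes = await toBytes(await converter.convert(html, 'docx'));
```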

parse(html: string): Promise<[DocumentElement](./types)[]>

Parse raw HTML into an intermediate representation.

  • html: HTML string.

Returns: Promise<[DocumentElement](./types)[]>
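
The intermediate representation can be inspected before conversion. The sketch below counts elements by type, including nested children. The `ElementSketch` shape (`type`, `text`, `content`) is an assumption made for illustration; the actual DocumentElement union is defined on the types page and may differ. The sample array is hand-written and stands in for the result of `parse`.

```typescript
// Assumed minimal shape of a parsed element; the real DocumentElement
// union is documented on the types page and may differ.
interface ElementSketch {
  type: string;
  text?: string;
  content?: ElementSketch[];
}

// Count how many elements of each type appear, including nested children.
function countByType(elements: ElementSketch[]): Record<string, number> {
  const counts: Record<string, number> = {};
  const visit = (nodes: ElementSketch[]): void => {
    for (const node of nodes) {
      counts[node.type] = (counts[node.type] ?? 0) + 1;
      if (node.content) visit(node.content);
    }
  };
  visit(elements);
  return counts;
}

// Hand-written sample standing in for the result of converter.parse(html).
const sample: ElementSketch[] = [
  { type: 'heading', text: 'Hello World' },
  { type: 'paragraph', content: [{ type: 'text', text: 'This is a paragraph.' }] },
];
const sampleCounts = countByType(sample);
```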

useMiddleware(mw: Middleware): void

Register a middleware function to process HTML before parsing.
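
A middleware is an async function from HTML string to HTML string, so a small one might look like the sketch below. `removeHtmlComments` and `stripComments` are hypothetical names, and the `Middleware` type is redeclared locally to keep the snippet self-contained.

```typescript
// Local copy of the documented Middleware signature.
type Middleware = (html: string) => Promise<string>;

// Synchronous core, kept separate so it is easy to unit test.
function removeHtmlComments(html: string): string {
  return html.replace(/<!--[\s\S]*?-->/g, '');
}

// Middleware wrapper matching the documented signature.
const stripComments: Middleware = async (html) => removeHtmlComments(html);

// With a converter from init(): converter.useMiddleware(stripComments);
```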

registerConverter(name: string, converter: IDocumentConverter): void

Register a custom document converter adapter.
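
To illustrate what a custom adapter could look like, the sketch below flattens elements to plain text and returns a Blob. Everything here is an assumption for illustration: the `ElementSketch` shape, the `DocumentConverterSketch` stand-in for IDocumentConverter, the `PlainTextConverter` class, and the `'txt'` format key are all hypothetical.

```typescript
// Assumed minimal element shape (hypothetical; see the types page for the real one).
interface ElementSketch {
  type: string;
  text?: string;
  content?: ElementSketch[];
}

// Local stand-in for the documented IDocumentConverter interface.
interface DocumentConverterSketch {
  convert(elements: ElementSketch[]): Promise<Blob>;
}

// Flatten elements to text, one top-level element per line.
function elementsToText(elements: ElementSketch[]): string {
  const textOf = (el: ElementSketch): string =>
    (el.text ?? '') + (el.content ?? []).map(textOf).join('');
  return elements.map(textOf).join('\n');
}

class PlainTextConverter implements DocumentConverterSketch {
  async convert(elements: ElementSketch[]): Promise<Blob> {
    return new Blob([elementsToText(elements)], { type: 'text/plain' });
  }
}

// With a converter from init():
//   converter.registerConverter('txt', new PlainTextConverter());
//   const blob = await converter.convert('<p>Hi</p>', 'txt');
```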

Types

| Type | Description |
| --- | --- |
| InitOptions | Options for initializing the converter via init. |
| ConverterOptions | Internal options for the Converter constructor. |
| Converter | Main class for conversion and parsing. |
| Middleware | Asynchronous function taking an HTML string and returning a Promise of string. |
| TagHandler | Handler that processes an HTMLElement with optional TagHandlerOptions and returns a DocumentElement or an array of DocumentElement. |
| TagHandlerObject | { key: string; handler: TagHandler } |
| DocumentElement | Union type for intermediate document elements (paragraph, heading, etc.). |
| ElementType | String literal type of element kinds ('paragraph', 'heading', etc.). |
| Styles | Map of style properties to values (string or number), with support for CSS properties. |
| IDOMParser | Interface for custom DOM parser with parse(html: string): Document. |
| IDocumentConverter | Interface for adapter converters. Method convert(elements: DocumentElement[]) returns a Promise resolving to a Buffer or Blob. |
| AdapterProvider | Constructor type for adapters (new(deps: IConverterDependencies) => IDocumentConverter). |
| StyleMapper | Class for mapping CSS styles to document styles. |

For more details, refer to the source code.