HTML to Document API Overview
This is the core html-to-document library for converting HTML content into professional documents (e.g., DOCX, PDF).
Installation
npm install html-to-document
Import
import { init, Converter } from 'html-to-document';
Quick Start
import { init, DocxAdapter } from 'html-to-document';
const converter = init({
  adapters: {
    register: [
      { format: 'docx', adapter: DocxAdapter },
    ],
  },
  // Other configuration
});
// Convert HTML string to DOCX buffer
converter.convert('<h1>Hello World</h1><p>This is a paragraph.</p>', 'docx')
  .then((buffer) => {
    // handle Buffer in Node.js or Blob in browsers
  })
  .catch(console.error);
API
init(options?: InitOptions): Converter
Initialize a new Converter instance.
- options: InitOptions (optional)
  - middleware?: [Middleware](./types)[] – custom middleware functions.
  - tags?: { tagHandlers?: [TagHandlerObject](./types)[]; defaultStyles?: ...; defaultAttributes?: ... } – custom tag handlers and default tag options.
  - adapters?: { defaultStyles?: ...; styleMappings?: ...; register?: { format: string; adapter: [AdapterProvider](./types); config?: object }[] } – register adapters, style mappings, and adapter-specific config.
  - clearMiddleware?: boolean – clear default middleware.
  - domParser?: [IDOMParser](./types) – custom DOM parser implementation.
Returns: a configured Converter instance.
Advanced Topics
Explore further customization using the links below:
- Initialization
- Custom Tag Handlers
- Middleware
- Style Mappings & Default Styles
- Custom Converters
- Types Reference
Converter
Class for parsing and converting HTML to document formats.
new Converter(options: ConverterOptions)
Create a Converter with raw options:
- tags?: ... – alias for options.tags in init.
- adapters?: ...
- registerAdapters?: { format: string; adapter: [IDocumentConverter](./types); styleMapper: [StyleMapper](./types) }[]
- domParser?: [IDOMParser](./types)
Methods
convert(content: string | [DocumentElement](./types)[], format: string): Promise<Buffer | Blob>
Convert HTML string or pre-parsed DocumentElement[] to a Buffer (Node.js) or Blob (browser).
- content: raw HTML or an array of DocumentElement.
- format: target format key (e.g., 'docx', 'pdf').
Returns: Promise<Buffer | Blob>
parse(html: string): Promise<[DocumentElement](./types)[]>
Parse raw HTML into an intermediate representation.
- html: HTML string.
Returns: Promise<[DocumentElement](./types)[]>
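Because convert accepts either raw HTML or a pre-parsed DocumentElement[], one possible pattern is to call parse once and reuse the result for several formats. The sketch below is illustrative only: ConverterLike is a hypothetical structural stand-in for the two documented method signatures (not a type exported by the library), convertToFormats is an invented helper name, and each format key is assumed to have a registered adapter.

```typescript
// Hypothetical structural stand-in for the documented Converter methods.
// E plays the role of DocumentElement; Out plays the role of Buffer | Blob.
interface ConverterLike<E, Out> {
  parse(html: string): Promise<E[]>;
  convert(content: string | E[], format: string): Promise<Out>;
}

// Parses the HTML once, then converts the same element array to each format.
// Assumes every key in `formats` has an adapter registered on the converter.
async function convertToFormats<E, Out>(
  converter: ConverterLike<E, Out>,
  html: string,
  formats: string[],
): Promise<Map<string, Out>> {
  const elements = await converter.parse(html);
  const results = new Map<string, Out>();
  for (const format of formats) {
    results.set(format, await converter.convert(elements, format));
  }
  return results;
}
```

A converter returned by init could be passed as the first argument, provided its parse and convert methods match the signatures listed above.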
useMiddleware(mw: Middleware): void
Register a middleware function to process HTML before parsing.
- mw: Middleware function.
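As a rough illustration of what a middleware function might look like, the sketch below follows the description in the Types table (an asynchronous function from an HTML string to a Promise of string). The local Middleware alias, the two sample functions, and the runMiddleware helper are all hypothetical; in particular, running the chain in registration order is an assumption, not confirmed library behavior.

```typescript
// Local alias mirroring the Middleware description in the Types table
// (assumption: the library's exported type has this shape).
type Middleware = (html: string) => Promise<string>;

// Hypothetical middleware: drops <script> elements before parsing.
const stripScripts: Middleware = async (html) =>
  html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '');

// Hypothetical middleware: removes whitespace between adjacent tags.
const collapseWhitespace: Middleware = async (html) =>
  html.replace(/>\s+</g, '><').trim();

// Applies each middleware in array order (assumed ordering, for illustration).
async function runMiddleware(html: string, chain: Middleware[]): Promise<string> {
  let current = html;
  for (const mw of chain) {
    current = await mw(current);
  }
  return current;
}
```

With a real converter, a function of this shape would be registered through converter.useMiddleware(stripScripts), per the signature above.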
registerConverter(name: string, converter: IDocumentConverter): void
Register a custom document converter adapter.
- name: format key.
- converter: instance of IDocumentConverter.
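A custom converter could look roughly like the following plain-text adapter. This is a sketch under stated assumptions: SketchElement is a simplified, hypothetical stand-in for DocumentElement (the real union type is not spelled out here and is likely richer), DocumentConverterLike mirrors only the convert method described in the Types table, and the result is returned as a Blob rather than a Buffer.

```typescript
// Simplified, hypothetical stand-in for DocumentElement; the field names
// `text` and `content` are assumptions made for this example only.
interface SketchElement {
  type: string;
  text?: string;
  content?: SketchElement[];
}

// Mirrors the documented IDocumentConverter contract (convert only).
interface DocumentConverterLike {
  convert(elements: SketchElement[]): Promise<Blob>;
}

// Hypothetical adapter that flattens the element tree to plain text,
// writing one line per top-level element.
class PlainTextConverter implements DocumentConverterLike {
  async convert(elements: SketchElement[]): Promise<Blob> {
    const lines = elements.map((el) => this.flatten(el));
    return new Blob([lines.join('\n')], { type: 'text/plain' });
  }

  // Concatenates an element's own text with the text of its children.
  private flatten(el: SketchElement): string {
    const own = el.text ?? '';
    const nested = (el.content ?? []).map((child) => this.flatten(child)).join('');
    return own + nested;
  }
}
```

An adapter written against the library's actual types would then be registered with something like converter.registerConverter('txt', new PlainTextConverter()), where 'txt' is an arbitrary format key chosen for this example.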
Types
| Type | Description |
|---|---|
| InitOptions | Options for initializing the converter via init. |
| ConverterOptions | Internal options for the Converter constructor. |
| Converter | Main class for conversion and parsing. |
| Middleware | Asynchronous function taking an HTML string and returning a Promise of string. |
| TagHandler | Handler that processes an HTMLElement with optional TagHandlerOptions and returns a DocumentElement or an array of DocumentElement. |
| TagHandlerObject | { key: string; handler: TagHandler } |
| DocumentElement | Union type for intermediate document elements (paragraph, heading, etc.). |
| ElementType | String literal type of element kinds ('paragraph', 'heading', etc.). |
| Styles | Map of style properties to values (string or number), with support for CSS properties. |
| IDOMParser | Interface for custom DOM parser with parse(html: string): Document. |
| IDocumentConverter | Interface for adapter converters. Method convert(elements: DocumentElement[]) returns a Promise resolving to a Buffer or Blob. |
| AdapterProvider | Constructor type for adapters (new (deps: IConverterDependencies) => IDocumentConverter). |
| StyleMapper | Class for mapping CSS styles to document styles. |
For more details, refer to the source code.