HTML to Document API Overview
This is the core html-to-document library for converting HTML content into professional documents (e.g., DOCX, PDF).
Installation
npm install html-to-document
Import
import { init, Converter } from 'html-to-document';
Quick Start
import { init, DocxAdapter } from 'html-to-document';

const converter = init({
  adapters: {
    register: [
      { format: 'docx', adapter: DocxAdapter },
    ],
  },
  // Other configuration
});

// Convert HTML string to DOCX buffer
converter.convert('<h1>Hello World</h1><p>This is a paragraph.</p>', 'docx')
  .then((buffer) => {
    // handle Buffer in Node.js or Blob in browsers
  })
  .catch(console.error);
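The Quick Start leaves the result handling open. One way to treat both environments uniformly is to normalize the result to raw bytes first. This is a minimal sketch: toBytes is a hypothetical helper, not part of html-to-document, and it assumes a runtime where Blob is a global (browsers and recent Node.js versions).

```typescript
// Hypothetical helper (not part of html-to-document): normalize the value
// resolved by converter.convert() to raw bytes in either environment.
async function toBytes(result: Uint8Array | Blob): Promise<Uint8Array> {
  if (result instanceof Uint8Array) {
    // Node.js: Buffer is a subclass of Uint8Array, so it passes through as-is.
    return result;
  }
  // Browser: read the Blob's contents into memory.
  return new Uint8Array(await result.arrayBuffer());
}
```

In Node.js the bytes can then be written with fs.promises.writeFile; in a browser it is usually simpler to keep the Blob and hand it to URL.createObjectURL for a download link.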
API
init(options?: InitOptions): Converter
Initialize a new Converter instance.
- options: InitOptions (optional)
  - middleware?: [Middleware](./types)[] – custom middleware functions.
  - tags?: { tagHandlers?: [TagHandlerObject](./types)[]; defaultStyles?: ...; defaultAttributes?: ... } – custom tag handlers and default tag options.
  - adapters?: { defaultStyles?: ...; styleMappings?: ...; register?: { format: string; adapter: [AdapterProvider](./types); config?: object }[] } – register adapters, style mappings, and adapter-specific config.
  - clearMiddleware?: boolean – clear default middleware.
  - domParser?: [IDOMParser](./types) – custom DOM parser implementation.
Returns: a configured Converter instance.
Advanced Topics
Explore further customization using the links below:
- Initialization
- Custom Tag Handlers
- Middleware
- Style Mappings & Default Styles
- Custom Converters
- Types Reference
Converter
Class for parsing and converting HTML to document formats.
new Converter(options: ConverterOptions)
Create a Converter with raw options:
- tags?: ... – alias for options.tags in init.
- adapters?: ...
- registerAdapters?: { format: string; adapter: [IDocumentConverter](./types); styleMapper: [StyleMapper](./types) }[]
- domParser?: [IDOMParser](./types)
Methods
convert(content: string | [DocumentElement](./types)[], format: string): Promise<Buffer | Blob>
Convert an HTML string or a pre-parsed DocumentElement[] to a Buffer (Node.js) or Blob (browser).
- content: raw HTML or an array of DocumentElement.
- format: target format key (e.g., 'docx', 'pdf').
Returns: Promise<Buffer | Blob>
parse(html: string): Promise<[DocumentElement](./types)[]>
Parse raw HTML into an intermediate representation.
- html: HTML string.
Returns: Promise<[DocumentElement](./types)[]>
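Because convert also accepts a pre-parsed DocumentElement array, parse and convert can be called separately to adjust the intermediate representation in between. The sketch below is illustrative only. ElementLike is an assumed minimal element shape (type, text, content) declared locally, ConverterLike is a structural stand-in for the two documented methods, and dropEmptyParagraphs and convertCleaned are hypothetical helpers. Check the Types Reference for the real DocumentElement fields before relying on them.

```typescript
// Assumed minimal element shape, declared locally for this sketch; the real
// DocumentElement type comes from html-to-document and may differ.
type ElementLike = { type: string; text?: string; content?: ElementLike[] };

// Structural stand-in for the parse and convert methods documented above.
interface ConverterLike<R> {
  parse(html: string): Promise<ElementLike[]>;
  convert(content: string | ElementLike[], format: string): Promise<R>;
}

// Hypothetical filter: drop paragraphs that have no text and no children.
function dropEmptyParagraphs(elements: ElementLike[]): ElementLike[] {
  return elements.filter(
    (el) =>
      el.type !== 'paragraph' ||
      (el.text ?? '').trim() !== '' ||
      (el.content ?? []).length > 0,
  );
}

// Parse first, adjust the intermediate elements, then convert them.
async function convertCleaned<R>(
  converter: ConverterLike<R>,
  html: string,
  format: string,
): Promise<R> {
  const elements = await converter.parse(html);
  return converter.convert(dropEmptyParagraphs(elements), format);
}
```

Splitting the two steps keeps the filtering logic a plain synchronous function, which is easy to unit test without running a real conversion.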
useMiddleware(mw: Middleware): void
Register a middleware function to process HTML before parsing.
- mw: Middleware function.
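As an illustration of the middleware shape, the sketch below strips HTML comments before parsing. It assumes only what the Types table states: a Middleware is an asynchronous function from an HTML string to an HTML string. The Middleware alias is redeclared locally so the snippet stands alone, and stripHtmlComments and stripCommentsMiddleware are hypothetical names.

```typescript
// Mirrors the Middleware description in the Types table; declared locally so
// the sketch does not depend on the package being installed.
type Middleware = (html: string) => Promise<string>;

// Hypothetical helper: remove HTML comments such as <!-- note -->.
function stripHtmlComments(html: string): string {
  return html.replace(/<!--[\s\S]*?-->/g, '');
}

// Wrap the synchronous helper so it has the Middleware shape.
const stripCommentsMiddleware: Middleware = async (html) => stripHtmlComments(html);

// With a converter returned by init():
//   converter.useMiddleware(stripCommentsMiddleware);
// or at initialization time:
//   init({ middleware: [stripCommentsMiddleware] });
```

Keeping the transformation in a plain function and wrapping it afterwards makes it testable without a converter instance.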
registerConverter(name: string, converter: IDocumentConverter): void
Register a custom document converter adapter.
- name: format key.
- converter: instance of IDocumentConverter.
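A custom converter only needs the convert(elements) method that IDocumentConverter is described as requiring in the Types table. The following is a rough sketch of a plain-text converter under stated assumptions: ElementLike is an assumed element shape (type, text, content) declared locally, PlainTextConverter and renderText are hypothetical names, 'txt' is an arbitrary format key, and Blob is assumed to be a global in the target runtime.

```typescript
// Assumed minimal element shape, declared locally for this sketch; the real
// DocumentElement type comes from html-to-document and may differ.
type ElementLike = { type: string; text?: string; content?: ElementLike[] };

// Flatten elements to plain text, one line per top-level element.
function renderText(elements: ElementLike[]): string {
  const textOf = (el: ElementLike): string =>
    (el.text ?? '') + (el.content ?? []).map(textOf).join('');
  return elements.map(textOf).join('\n');
}

// Hypothetical adapter exposing the convert(elements) method described for
// IDocumentConverter; it resolves to a Blob holding the rendered text.
class PlainTextConverter {
  async convert(elements: ElementLike[]): Promise<Blob> {
    return new Blob([renderText(elements)], { type: 'text/plain' });
  }
}

// With a converter returned by init():
//   converter.registerConverter('txt', new PlainTextConverter());
//   converter.convert('<p>Hello</p>', 'txt');
```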
Types
| Type | Description |
|---|---|
| InitOptions | Options for initializing the converter via init. |
| ConverterOptions | Internal options for the Converter constructor. |
| Converter | Main class for conversion and parsing. |
| Middleware | Asynchronous function taking an HTML string and returning a Promise of string. |
| TagHandler | Handler that processes an HTMLElement with optional TagHandlerOptions and returns a DocumentElement or an array of DocumentElement. |
| TagHandlerObject | { key: string; handler: TagHandler } |
| DocumentElement | Union type for intermediate document elements (paragraph, heading, etc.). |
| ElementType | String literal type of element kinds ('paragraph', 'heading', etc.). |
| Styles | Map of style properties to values (string or number), with support for CSS properties. |
| IDOMParser | Interface for custom DOM parser with parse(html: string): Document. |
| IDocumentConverter | Interface for adapter converters. Method convert(elements: DocumentElement[]) returns a Promise resolving to a Buffer or Blob. |
| AdapterProvider | Constructor type for adapters (new(deps: IConverterDependencies) => IDocumentConverter). |
| StyleMapper | Class for mapping CSS styles to document styles. |
For more details, refer to the source code.