libhtml – simple HTML parsing library


libhtml is a minimal, open source (ISC-licensed) C library for parsing, serialising, and manipulating HTML-4.01-strict and XHTML-1.0-strict documents. It may suit you if you need strict correctness of both input and output data.

Why? The predominant open source HTML parser, libxml, is enormous and complicated. For our needs, we wanted a small, strongly validating parser that focusses only on strict HTML.

The libhtml library is a Project member.


The sources build and install correctly on OpenBSD, NetBSD, and GNU/Linux operating systems, and have been tested variously on generic i386, AMD64, and DEC Alpha. The current version is 0.3.3.


Current source: libhtml.tar.gz (md5)
Archived source: /archive
On-line source: cvsweb


The manual is generated automatically and refers to the current snapshot. Examples are distributed with the source package.

html(3): simple HTML parsing library
test.c: example interfacing utility


For all issues related to libhtml, contact Kristaps.


12-07-2011: version 0.3.3

Fixed hnode_dechildpart logic.

05-04-2011: version 0.3.2

Pushed the commonly used struct iofd into the library, along with the iofd_close, iofd_getchar, iofd_open, and iofd_rew functions. Also added the iostdout_putchar and iostdout_puts convenience functions.

05-02-2011: version 0.3.1

Reviving project. Added some helper functions and cleaned up manual.

See cvsweb for historical notes.

Copyright © 2009, 2010, 2011 Kristaps Dzonsons, $Date: 2011/07/12 16:03:19 $