The Deep Checker script (deep-check.pl) provides additional validation on DocBook XML files. Standard validation is provided through the valid target on a document. However, this validation is limited to ensuring that the XML is valid XML, that all of the files included or referred to (including image files) are available, and that the different entities in the file can be parsed.
The standard validation doesn't check whether the definitions of the different elements are valid, or whether problems with specific combinations of different DocBook sequences will cause issues in certain target formats.
The Deep Checker script addresses these issues by parsing the DocBook XML, identifying problems, and reporting on them. The script checks for a variety of problems, including:
Checks for missing IDs on chapter and section elements.
Checks for duplicate IDs in a single file.
Checks the format and definition of tables (both table and informaltable). In particular, it:
Compares the column definitions (colspec) with the actual number of columns defined in the table.
Calculates the specified width of the table and ensures that it is not more than 100% of the width of the page.
Checks that specified images exist and that the image files available on the file system match those specified.
Checks the line length of programlisting elements to highlight potential line wrap in PDF documents. The line length can be configured on the command line.
Checks for empty link references. A link reference does not automatically populate the link text, so an empty link produces a link with question marks as the link text.
If you specifically request it, you can also check the ulink elements, which point to standard Internet URLs. The checking process accesses each URL and verifies whether the URL is still reachable.
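As an illustration of the table checks in the list above, the following Python sketch compares the number of colspec definitions with the entries in each row and totals the declared column widths. It is not the deep-check.pl implementation, which is a Perl script whose internals are not shown here. The function name check_table, the sample markup, and the assumption that widths are written as percentages in the colwidth attribute are all hypothetical.

```python
import xml.etree.ElementTree as ET


def check_table(table):
    """Return a list of problem descriptions for one table or informaltable.

    Illustrative only: spanning entries (namest/nameend, morerows) are
    ignored, and only colwidth values ending in "%" are counted.
    """
    problems = []
    for tgroup in table.iter("tgroup"):
        colspecs = tgroup.findall("colspec")

        # Compare the colspec count with the entries found in each row.
        for part in ("thead", "tbody", "tfoot"):
            for row in tgroup.findall(f"{part}/row"):
                entries = len(row.findall("entry"))
                if colspecs and entries != len(colspecs):
                    problems.append(
                        f"row has {entries} entries but "
                        f"{len(colspecs)} colspec elements"
                    )

        # Total the declared widths (assumed to be percentages).
        total = 0.0
        for colspec in colspecs:
            width = colspec.get("colwidth", "")
            if width.endswith("%"):
                total += float(width.rstrip("%"))
        if total > 100:
            problems.append(f"colspec widths add up to {total:g}%")
    return problems


# Hypothetical sample: one short row and widths totalling 110%.
SAMPLE = """
<informaltable>
  <tgroup cols="2">
    <colspec colwidth="60%"/>
    <colspec colwidth="50%"/>
    <tbody>
      <row><entry>a</entry><entry>b</entry></row>
      <row><entry>c</entry></row>
    </tbody>
  </tgroup>
</informaltable>
"""

problems = check_table(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```

A real checker would also need to account for entries that span several columns, which this sketch deliberately leaves out.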
The Deep Checker script reports errors in relation to the section in which they appear. This enables you to identify more easily where a problem is within the DocBook XML source.
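The idea of reporting problems against their enclosing section can be sketched in Python, using the missing and duplicate ID checks as the example. This is a minimal sketch under stated assumptions, not the logic of deep-check.pl itself: the function name check_ids, the report format, the set of section-like element names, and the use of an un-namespaced id attribute (DocBook 4 style) are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements treated as sections for reporting purposes (an assumption).
SECTION_TAGS = {"chapter", "section", "sect1", "sect2", "sect3"}


def check_ids(root):
    """Return reports for missing and duplicate IDs, tagged by section title."""
    reports = []
    seen = {}  # id value -> title of the section where it first appeared

    def walk(elem, context):
        if elem.tag in SECTION_TAGS:
            # The nearest enclosing section title becomes the context.
            context = elem.findtext("title", default="(untitled)")
            if elem.get("id") is None:
                reports.append(f"[{context}] <{elem.tag}> has no id")

        elem_id = elem.get("id")
        if elem_id is not None:
            if elem_id in seen:
                reports.append(
                    f"[{context}] duplicate id '{elem_id}' "
                    f"(first used in [{seen[elem_id]}])"
                )
            else:
                seen[elem_id] = context

        for child in elem:
            walk(child, context)

    walk(root, "(document)")
    return reports


# Hypothetical sample: one section without an id, one repeated id.
SAMPLE = """
<chapter id="tools">
  <title>Tools</title>
  <section>
    <title>Installing</title>
    <para id="intro">One.</para>
  </section>
  <section id="usage">
    <title>Usage</title>
    <para id="intro">Two.</para>
  </section>
</chapter>
"""

reports = check_ids(ET.fromstring(SAMPLE))
for report in reports:
    print(report)
```

Carrying the section title down through the tree walk is what lets each message name the place in the source where the problem occurs, rather than only the offending value.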