This section describes the standard tag types that apply to tagged PDFs. Assistive software and devices use these standard tags to interpret a document's structure and present its content in a useful manner.
The PDF tag architecture is extensible, so a PDF document can contain any tag set that an authoring application chooses to use. For example, a PDF can contain XML tags that came from an XML schema. Custom tags that you define (such as tag names generated from the paragraph styles of an authoring application) need a role map, which matches each custom tag to one of the standard tags described here. When assistive software encounters a custom tag, it can check the role map and interpret the tag properly. Tagging PDFs by using one of the methods described here generally produces a correct role map for the document.
You can view and edit the role map of a PDF by choosing Options > Edit Role Map in the Tags panel.
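The role-map lookup can be illustrated with a minimal Python sketch. It is not how Acrobat or any particular assistive tool implements the lookup; the function name `resolve_tag`, the custom tag names, and the abbreviated set of standard tags are all hypothetical. Following chained entries (a custom tag mapped to another custom tag) is an assumption of this sketch.

```python
# Illustrative subset of the standard tag types described in this section.
STANDARD_TAGS = {
    "Document", "Part", "Div", "Art", "Sect",
    "H1", "H2", "H3", "H4", "H5", "H6", "P",
    "L", "LI", "LBody", "Table", "TR", "TH", "TD",
    "Span", "Figure", "Link",
}

def resolve_tag(tag, role_map, standard_tags=STANDARD_TAGS):
    """Follow role-map entries until a standard tag is reached.

    Returns the standard tag, or None if the tag is unmapped
    or the mappings loop back on themselves.
    """
    seen = set()
    while tag not in standard_tags:
        if tag in seen or tag not in role_map:
            return None
        seen.add(tag)
        tag = role_map[tag]
    return tag

# Hypothetical custom tags, such as names generated from paragraph styles.
role_map = {"BodyText": "P", "ChapterTitle": "Heading1", "Heading1": "H1"}

print(resolve_tag("BodyText", role_map))      # P
print(resolve_tag("ChapterTitle", role_map))  # H1
print(resolve_tag("Sidebar", role_map))       # None
```

A result of None marks a custom tag that software relying on the role map could not interpret, which is the case a correct role map is meant to prevent.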
The standard Adobe element tag types are available in the New Tag dialog box. They are also available in the Touch Up Properties dialog box in Acrobat Pro. Adobe strongly encourages using these tag types because they provide the best results when tagged content is converted to a different format, such as HTML, Microsoft Word, or an accessible text format for use by other assistive technologies.
Block-level elements are page elements that consist of text laid out in paragraph-like forms. Block-level elements are part of a document’s logical structure. Such elements are further classified as container elements, heading and paragraph elements, label and list elements, special text elements, and table elements.
Container elements are the highest level of element and provide hierarchical grouping for other block-level elements.
Document Document element. The root element of a document’s tag tree.
Part Part element. A large division of a document; may group smaller units of content together, such as division elements, article elements, or section elements.
Div Division element. A generic block-level element or group of block-level elements.
Art Article element. A self-contained body of text considered to be a single narrative.
Sect Section element. A general container element type, comparable to a division (DIV class=“Sect”) in HTML; usually a component of a part element or an article element.
Heading and paragraph elements are paragraph-like, block-level elements that include level-specific heading tags and the generic paragraph (P) tag. A heading (H) element should appear as the first child of any higher-level division. Six levels of headings (H1 to H6) are available for applications that don’t hierarchically nest sections.
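The guideline that a heading should be the first child of a division can be checked mechanically. The following Python sketch is one way to do so under stated assumptions: the tag tree is represented as hypothetical `(name, children)` tuples, "higher-level division" is taken to mean the Part, Div, Art, and Sect containers, and H1 to H6 are accepted alongside H.

```python
# Assumption: these containers count as "higher-level divisions".
CONTAINERS = {"Part", "Div", "Art", "Sect"}
HEADINGS = {"H", "H1", "H2", "H3", "H4", "H5", "H6"}

def missing_headings(node, path=""):
    """Yield the paths of containers whose first child is not a heading."""
    name, children = node
    here = f"{path}/{name}"
    if name in CONTAINERS and (not children or children[0][0] not in HEADINGS):
        yield here
    for child in children:
        yield from missing_headings(child, here)

tree = ("Document", [
    ("Sect", [("H1", []), ("P", [])]),   # heading comes first
    ("Sect", [("P", []), ("H2", [])]),   # paragraph precedes the heading
])

print(list(missing_headings(tree)))  # ['/Document/Sect']
```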
Label and list elements are block-level elements used for structuring lists.
L List element. Any sequence of items of similar meaning or other relevance; immediate child elements should be list item elements.
LI List item element. Any one member of a list; may have a label element (optional) and a list body element (required) as a child.
Lbl Label element. A bullet, name, or number that identifies and distinguishes an element from others in the same list.
LBody List item body element. The descriptive content of a list item.
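The parent-child rules for list elements described above can be sketched as a small structural check. This is a hypothetical Python helper, not part of any Adobe tool; it represents each element as a `(name, children)` tuple and tests only the two rules stated here: children of L should be LI, and each LI needs an LBody. The label is optional, so it is not checked.

```python
def check_list(node):
    """Return a list of problems found in an L element (empty if none)."""
    name, children = node
    if name != "L":
        return [f"expected L, found {name}"]
    problems = []
    for i, (child_name, grandchildren) in enumerate(children):
        if child_name != "LI":
            problems.append(f"child {i}: expected LI, found {child_name}")
            continue
        # The list body is the one required child of a list item.
        if "LBody" not in [g[0] for g in grandchildren]:
            problems.append(f"child {i}: LI has no LBody")
    return problems

good = ("L", [("LI", [("LBody", [])]), ("LI", [("LBody", [])])])
bad = ("L", [("P", []), ("LI", [])])

print(check_list(good))  # []
print(check_list(bad))
# ['child 0: expected LI, found P', 'child 1: LI has no LBody']
```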
Special text elements identify text that isn’t used as a generic paragraph (P).
BlockQuote Block quote element. One or more paragraphs of text attributed to someone other than the author of the immediate surrounding text.
Caption Caption element. A brief portion of text that describes a table or a figure.
Index Index element. A sequence of entries that contain identifying text and reference elements that point out the occurrence of the text in the main body of the document.
TOC Table of contents element. An element that contains a structured list of items and labels identifying those items; has its own discrete hierarchy.
TOCI Table of contents item element. An item contained in a list associated with a table of contents element.
Table elements are special elements for structuring tables.
Table Table element. A two-dimensional arrangement of data or text cells that contains table row elements as child elements. It may have a caption element as its first or last child element.
TR Table row element. One row of headings or data in a table; may contain table header cell elements and table data cell elements.
TD Table data cell element. A table cell that contains nonheader data.
TH Table header cell element. A table cell that contains header text or data describing one or more rows or columns of a table.
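The nesting of table elements described above can be sketched as a structural check. The Python helper below is hypothetical and uses `(name, children)` tuples to stand in for tags. Treating anything other than TH or TD inside a TR as a problem is an assumption of this sketch, since the descriptions above say only what a row may contain.

```python
def check_table(table):
    """Return a list of problems found in a Table element (empty if none)."""
    name, children = table
    if name != "Table":
        return [f"expected Table, found {name}"]
    problems = []
    last = len(children) - 1
    for i, (child_name, cells) in enumerate(children):
        if child_name == "Caption":
            # A caption may appear only as the first or last child.
            if i not in (0, last):
                problems.append(f"child {i}: Caption must be first or last")
        elif child_name == "TR":
            for j, (cell_name, _) in enumerate(cells):
                if cell_name not in ("TH", "TD"):
                    problems.append(
                        f"child {i}, cell {j}: expected TH or TD, found {cell_name}"
                    )
        else:
            problems.append(f"child {i}: expected TR, found {child_name}")
    return problems

table = ("Table", [
    ("Caption", []),
    ("TR", [("TH", []), ("TH", [])]),
    ("TR", [("TD", []), ("P", [])]),
])

print(check_table(table))
# ['child 2, cell 1: expected TH or TD, found P']
```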
Inline-level elements identify a span of text that has specific formatting or behavior, in contrast to block-level elements, which consist of text laid out in paragraph-like forms. Inline-level elements may be contained in or contain block-level elements.
BibEntry Bibliography entry element. A description of where some cited information may be found.
Quote Quote entry element. An inline portion of text that is attributed to someone other than the author of the surrounding text. Unlike a block quote, which is one or more whole paragraphs, a quote is inline text.
Span Span entry element. Any inline segment of text; commonly used to delimit text that is associated with a set of styling properties.
Similar to inline-level elements, special inline-level elements describe an inline portion of text that has special formatting or behavior.
Code Code entry element. Computer program text embedded within a document.
Figure Figure entry element. A graphic associated with the text.
Form Form entry element. A PDF form annotation that can be or has been filled out.
Formula Formula entry element. A mathematical formula.
Link Link entry element. A hyperlink that is embedded within a document. The target can be in the same document, in another PDF document, or on a website.
Note Note entry element. Explanatory text or documentation, such as a footnote or endnote, that is referred to in the main body of text.
Reference Reference entry element. A citation to text or data that is found elsewhere in the document.