This specification depends on the WHATWG Infra standard. [INFRA]
Generally, when the specification states that a feature applies to the HTML syntax or the XML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XML)".
This specification uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications. The term is used to refer both to Document objects and their descendant DOM trees, and to serialized byte streams using the HTML syntax or the XML syntax, depending on context.
In the context of the DOM structures, the terms HTML document and XML document are used as defined in the DOM specification, and refer specifically to two different modes that Document objects can find themselves in. [DOM] (Such uses are always hyperlinked to their definition.)
In the context of byte streams, the term HTML document refers to resources labeled as text/html, and the term XML document refers to resources labeled with an XML MIME type.
For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.
To run steps in parallel means those steps are to be run, one after another, at the same time as other logic in the standard (e.g., at the same time as the event loop). This standard does not define the precise mechanism by which this is achieved, be it time-sharing cooperative multitasking, fibers, threads, processes, using different hyperthreads, cores, CPUs, machines, etc. By contrast, an operation that is to run immediately must interrupt the currently running task, run itself, and then resume the previously running task.
To avoid race conditions between different in parallel algorithms that operate on the same data, a parallel queue can be used.
A parallel queue represents a queue of algorithm steps that must be run in series.
A parallel queue has an algorithm queue (a queue), initially empty.
To enqueue steps to a parallel queue, enqueue the algorithm steps to the parallel queue's algorithm queue.
To start a new parallel queue, run the following steps:
Let parallelQueue be a new parallel queue.
Run the following steps in parallel:
Let steps be the result of dequeueing from parallelQueue's algorithm queue.
If steps is not nothing, then run steps.
Assert: running steps did not throw an exception, as steps running in parallel are not allowed to throw.
Implementations are not expected to implement this as a continuously running loop. Algorithms in standards are to be easy to understand and are not necessarily great for battery life or performance.
Steps running in parallel can themselves run other steps in parallel. E.g., inside a parallel queue it can be useful to run a series of steps in parallel with the queue.
The following solution suffers from race conditions:
Let p be a new promise.
Run the following steps in parallel:
Two invocations of the above could run simultaneously, meaning name isn't in nameList during step 2.1, but it might be added before step 2.3 runs, meaning name ends up in nameList twice.
Parallel queues solve this. The standard would let nameListQueue be the result of starting a new parallel queue, then:
Let p be a new promise.
Enqueue the following steps to nameListQueue:
The steps would now queue and the race is avoided.
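As a rough illustration, a parallel queue can be modeled in TypeScript as a promise chain. This is only a sketch under stated assumptions: the names ParallelQueue, addName, and lengthyWork are hypothetical, the "lengthy work" is a stand-in timer, and the individual steps (check the list, do the work, append the name) are inferred from the race described above rather than taken from the standard's algorithm text.

```typescript
// Hypothetical sketch: a parallel queue modeled as a promise chain.
type Steps = () => Promise<void>;

class ParallelQueue {
  // The tail settles once every step enqueued so far has finished.
  private tail: Promise<void> = Promise.resolve();

  // Enqueued steps start only after all earlier steps have completed,
  // so steps on the same queue run in series.
  enqueue(steps: Steps): void {
    this.tail = this.tail.then(steps);
  }
}

const nameList: string[] = [];
const nameListQueue = new ParallelQueue();

// Stand-in for potentially lengthy work (an assumption for illustration).
const lengthyWork = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 10));

function addName(name: string): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    nameListQueue.enqueue(async () => {
      // The steps never throw; failure is reported by rejecting the promise.
      if (nameList.indexOf(name) !== -1) {
        reject(new TypeError(name + " is already in the list"));
        return;
      }
      await lengthyWork();
      nameList.push(name);
      resolve();
    });
  });
}
```

With this sketch, two back-to-back calls such as addName("a") and addName("a") cannot interleave: the second set of steps starts only after the first has appended the name, so the second promise is rejected and the name appears in the list once.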
The specification uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.
For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image also contained animation data.
An MPEG-4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.
What some specifications, in particular the HTTP specification, refer to as a representation is referred to in this specification as a resource. [HTTP]
A resource's critical subresources are those that the resource needs to have available to be correctly processed. Which resources are considered critical or not is defined by the specification that defines the resource's format.
To ease migration from HTML to XML, UAs conforming to this specification
will place elements in HTML in the
http://www.w3.org/1999/xhtml namespace, at least for the purposes of the DOM and
CSS. The term "HTML elements" refers to any element in that namespace,
even in XML documents.
Except where otherwise stated, all elements defined or mentioned in this specification are in the HTML namespace ("http://www.w3.org/1999/xhtml"), and all attributes defined or mentioned in this specification have no namespace.
The term element type is used to refer to the set of elements that have a given local name and namespace. For example, button elements are elements with the element type button, meaning they have the local name "button" and (implicitly as defined above) the HTML namespace.
Attribute names are said to be XML-compatible if they match the
Name production defined in XML and they contain no U+003A COLON
characters (:). [XML]
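The definition above can be sketched as a small check. This is an illustrative sketch only: the function name isXmlCompatible is hypothetical, and the character ranges are transcribed from the Name production in XML 1.0 (Fifth Edition), which is assumed to be the edition in use.

```typescript
// Character classes transcribed from the XML 1.0 (Fifth Edition) productions
// NameStartChar and NameChar. Assumption: that edition is the one intended.
const nameStartChar =
  ":A-Z_a-z\\u{C0}-\\u{D6}\\u{D8}-\\u{F6}\\u{F8}-\\u{2FF}\\u{370}-\\u{37D}" +
  "\\u{37F}-\\u{1FFF}\\u{200C}-\\u{200D}\\u{2070}-\\u{218F}\\u{2C00}-\\u{2FEF}" +
  "\\u{3001}-\\u{D7FF}\\u{F900}-\\u{FDCF}\\u{FDF0}-\\u{FFFD}\\u{10000}-\\u{EFFFF}";
const nameChar =
  nameStartChar + "\\-.0-9\\u{B7}\\u{300}-\\u{36F}\\u{203F}-\\u{2040}";

// Name ::= NameStartChar (NameChar)*
const xmlName = new RegExp("^[" + nameStartChar + "][" + nameChar + "]*$", "u");

// Hypothetical helper: true if the name matches Name and contains no colon.
function isXmlCompatible(attributeName: string): boolean {
  return xmlName.test(attributeName) && attributeName.indexOf(":") === -1;
}
```

Under this sketch, a name such as "data-foo" is XML-compatible, while "xlink:href" is not (it contains a colon) and "1abc" is not (a digit cannot start a Name).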
When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.
A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.
The term empty, when used for an attribute value,
or string, means that the length of the text is zero (i.e., not even containing controls or U+0020 SPACE).
A node A is inserted into a node B when the insertion steps are invoked with A as the argument and A's new parent is B. Similarly, a node A is removed from a node B when the removing steps are invoked with A as the removedNode argument and B as the oldParent argument.
A node is inserted into a document when the insertion steps are invoked with it as the argument and it is now in a document tree. Analogously, a node is removed from a document when the removing steps are invoked with it as the argument and it is now no longer in a document tree.
A node becomes connected when the insertion steps are invoked with it as the argument and it is now connected. Analogously, a node becomes disconnected when the removing steps are invoked with it as the argument and it is now no longer connected.
A node is browsing-context connected when it is connected and its shadow-including root has a browsing context. A node becomes browsing-context connected when the insertion steps are invoked with it as the argument and it is now browsing-context connected. A node becomes browsing-context disconnected either when the removing steps are invoked with it as the argument and it is now no longer browsing-context connected, or when its shadow-including root no longer has a browsing context.
The construction "a Foo object", where Foo is actually an interface, is sometimes used instead of the more accurate "an object implementing the interface Foo".
An IDL attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.
If a DOM object is said to be live, then the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data.
The term plugin refers to a user-agent defined set of content handlers used by the user agent that can take part in the user agent's rendering of a Document object, but that neither act as child browsing contexts of the Document nor introduce any Node objects to the Document's DOM.
Typically such content handlers are provided by third parties, though a user agent can also designate built-in content handlers as plugins.
A user agent must not consider the types
application/octet-stream as having a registered plugin.
One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separate from the user agent (as opposed to using the same interface) is not a plugin by this definition.
This specification does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. Indeed, this specification doesn't require user agents to support plugins at all. [NPAPI]
A plugin can be secured if it honors the semantics of
For example, a secured plugin would prevent its contents from creating pop-up
windows when the plugin is instantiated inside a sandboxed
Browsers should take extreme care when interacting with external content intended for plugins. When third-party software is run with the same privileges as the user agent itself, vulnerabilities in the third-party software become as dangerous as those in the user agent.
Since different users having different sets of plugins provides a fingerprinting vector that increases the chances of users being uniquely identified, user agents are encouraged to support the exact same set of plugins for each user.
A character encoding, or just encoding where that is not ambiguous, is a defined way to convert between byte streams and Unicode strings, as defined in the WHATWG Encoding standard. An encoding has an encoding name and one or more encoding labels, referred to as the encoding's name and labels in the Encoding standard. [ENCODING]
A UTF-16 encoding is UTF-16BE or UTF-16LE. [ENCODING]
An ASCII-compatible encoding is any encoding that is not a UTF-16 encoding. [ENCODING]
Since support for encodings that are not defined in the WHATWG Encoding standard is prohibited, UTF-16 encodings are the only encodings that this specification needs to treat as not being ASCII-compatible encodings.
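Because only the UTF-16 encodings fall outside the ASCII-compatible set, the distinction reduces to a membership test. The following is a minimal sketch, assuming its argument is already the name of an encoding defined in the Encoding standard (not an arbitrary label); the function name is hypothetical.

```typescript
// The two UTF-16 encodings, identified by their encoding names.
const UTF16_ENCODING_NAMES: string[] = ["UTF-16BE", "UTF-16LE"];

// Hypothetical helper: assumes encodingName is a valid encoding name
// from the Encoding standard, such as "UTF-8" or "windows-1252".
function isAsciiCompatibleEncoding(encodingName: string): boolean {
  return UTF16_ENCODING_NAMES.indexOf(encodingName) === -1;
}
```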
This specification describes the conformance criteria for user agents (relevant to implementers) and documents (relevant to authors and authoring tool implementers).
Conforming documents are those that comply with all the conformance criteria for documents. For readability, some of these conformance requirements are phrased as conformance requirements on authors; such requirements are implicitly requirements on documents: by definition, all documents are assumed to have had an author. (In some cases, that author may itself be a user agent — such user agents are subject to additional rules, as explained below.)
For example, if a requirement states that "authors must not use the foobar element", it would imply that documents are not allowed to contain elements named foobar.
There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.
User agents fall into several (overlapping) categories with different conformance requirements.
Web browsers that support the XML syntax must process elements and attributes from the HTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.
A conforming Web browser would, upon finding a
script element in
an XML document, execute the script contained in that element. However, if the element is found
within a transformation expressed in XSLT (assuming the user agent also supports XSLT), then the
processor would instead treat the
script element as an opaque element that forms
part of the transform.
Web browsers that support the HTML syntax must process documents labeled with an HTML MIME type as described in this specification, so that users can interact with them.
User agents that support scripting must also be conforming implementations of the IDL fragments in this specification, as described in the Web IDL specification. [WEBIDL]
Unless explicitly stated, specifications that override the semantics of HTML elements do not override the requirements on DOM objects representing those elements. For example, the script element in the example above would still implement the HTMLScriptElement interface.
User agents that process HTML and XML documents purely to render non-interactive versions of them must comply with the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.
Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.
A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.
User agents, whether interactive or not, may be designated (possibly as a user option) as supporting the suggested default rendering defined by this specification.
This is not required. In particular, even user agents that do implement the suggested default rendering are encouraged to offer settings that override this default to improve the experience for the user, e.g. changing the color contrast, using different focus styles, or otherwise making the experience more accessible and usable to the user.
User agents that are designated as supporting the suggested default rendering must, while so designated, implement the rules the rendering section defines as the behavior that user agents are expected to implement.
Implementations that do not support scripting (or which have their scripting features disabled entirely) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.
Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.
Conformance checkers must verify that a document conforms to the applicable conformance
criteria described in this specification. Automated conformance checkers are exempt from
detecting errors that require interpretation of the author's intent (for example, while a
document is non-conforming if the content of a
blockquote element is not a quote,
conformance checkers running without the input of human judgement do not have to check that
blockquote elements only contain quoted material).
Conformance checkers must check that the input document conforms when parsed without a browsing context (meaning that no scripts are run, and that the parser's scripting flag is disabled), and should also check that the input document conforms when parsed with a browsing context in which scripts execute, and that the scripts never cause non-conforming states to occur other than transiently during script execution itself. (This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [COMPUTABLE])
The term "HTML validator" can be used to refer to a conformance checker that itself conforms to the applicable requirements of this specification.
XML DTDs cannot express all the conformance requirements of this specification. Therefore, a validating XML processor and a DTD cannot constitute a conformance checker. Also, since neither of the two authoring formats defined in this specification is an application of SGML, a validating SGML system cannot constitute a conformance checker either.
To put it another way, there are three types of conformance criteria:
A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.
Applications and tools that process HTML and XML documents for reasons other than to either render the documents or check them for conformance should act in accordance with the semantics of the documents that they process.
A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.
Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.
Authoring tools are exempt from the strict requirements of using elements only for their specified purpose, but only to the extent that authoring tools are not yet able to determine author intent. However, authoring tools must not automatically misuse elements or encourage their users to do so.
For example, it is not conforming to use an address element for arbitrary contact information; that element can only be used for marking up contact information for its nearest article or body element ancestor. However, since an authoring tool is likely unable to determine the difference, an authoring tool is exempt from that requirement. This does not mean, though, that authoring tools can use address elements for any block of italics text (for instance); it just means that the authoring tool doesn't have to verify that when the user uses a tool for inserting contact information for an article element, that the user really is doing that and not inserting something else instead.
In terms of conformance checking, an editor has to output documents that conform to the same extent that a conformance checker will verify.
When an authoring tool is used to edit a non-conforming document, it may preserve the conformance errors in sections of the document that were not edited during the editing session (i.e. an editing tool is allowed to round-trip erroneous content). However, an authoring tool must not claim that the output is conformant if errors have been so preserved.
Authoring tools are expected to come in two broad varieties: tools that work from structure or semantic data, and tools that work on a What-You-See-Is-What-You-Get media-specific editing basis (WYSIWYG).
The former is the preferred mechanism for tools that author HTML, since the structure in the source information can be used to make informed choices regarding which HTML elements and attributes are most appropriate.
However, WYSIWYG tools are legitimate. WYSIWYG tools should use elements they know are
appropriate, and should not use elements that they do not know to be appropriate. This might in
certain extreme cases mean limiting the use of flow elements to just a few elements, like
span and making liberal use
All authoring tools, whether WYSIWYG or not, should make a best-effort attempt at enabling users to create well-structured, semantically rich, media-independent content.
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML, and one using a custom format inspired by SGML (referred to as the HTML syntax). Implementations must support at least one of these two formats, although supporting both is encouraged.
Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents. Similarly, some conformance requirements are phrased as requirements on authors; such requirements are to be interpreted as conformance requirements on the documents that authors produce. (In other words, this specification does not distinguish between conformance criteria on authors and conformance criteria on documents.)
This specification relies on several other underlying specifications.
The following terms are defined in the WHATWG Infra standard: [INFRA]
This specification introduces terminology based on the terms defined in those specifications, as described earlier.
The following terms are used as defined in the WHATWG Encoding standard: [ENCODING]
Implementations that support the XML syntax for HTML must support some version of XML, as well as its corresponding namespaces specification, because that syntax uses an XML serialization with namespaces. [XML] [XMLNS]
Data mining tools and other user agents that perform operations on content without running scripts, evaluating CSS or XPath expressions, or otherwise exposing the resulting DOM to arbitrary content, may "support namespaces" by just asserting that their DOM node analogues are in certain namespaces, without actually exposing the namespace strings.
In the HTML syntax, namespace prefixes and namespace declarations do not have the same effect as in XML. For instance, the colon has no special meaning in HTML element names.
This specification also non-normatively mentions the
interface and its
transformToDocument() methods. [XSLTP]
The following terms are defined in the WHATWG URL standard: [URL]
This specification also references a number of schemes and protocols:
The following terms are defined in the HTTP specifications: [HTTP]
The following terms are defined in the Cookie specification: [COOKIES]
The following term is defined in the Web Linking specification: [WEBLINK]
The following terms are defined in the WHATWG MIME Sniffing standard: [MIMESNIFF]
The following terms are defined in the WHATWG Fetch standard: [FETCH]
The following terms are defined in Referrer Policy: [REFERRERPOLICY]
`Referrer-Policy` HTTP header
`Referrer-Policy` header algorithm
"no-referrer-when-downgrade", and "unsafe-url" referrer policies
The following terms are defined in Mixed Content: [MIX]
The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]
The following terms are defined in the Web IDL specification:
The Web IDL specification also defines the following types that are used in Web IDL fragments in this specification:
When this specification requires a user agent to create a Date object representing a particular time (which could be the special value Not-a-Number), the milliseconds component of that time, if any, must be truncated to an integer, and the time value of the newly created Date object must represent the resulting truncated time.
For instance, given the time 23045 millionths of a second after 00:00 UTC on January 1st 2000, i.e. the time 2000-01-01T00:00:00.023045Z, then the Date object created representing that time would represent the same time as that created representing the time 2000-01-01T00:00:00.023Z, 45 millionths earlier. If the given time is NaN, then the result is a Date object that represents a time value NaN (indicating that the object does not represent a specific instant of time).
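The truncation rule can be sketched as follows. This is an illustration rather than normative text: the helper name createTruncatedDate is hypothetical, and the input is assumed to be a time value in milliseconds since the epoch that may carry a fractional part.

```typescript
// Hypothetical helper: builds a Date from a possibly fractional time value
// in milliseconds, truncating the fraction as the rule above requires.
function createTruncatedDate(timeValueMs: number): Date {
  // Math.trunc(NaN) is NaN, so a NaN input yields a Date whose time value is NaN.
  return new Date(Math.trunc(timeValueMs));
}

// 23045 millionths of a second (23.045 ms) after midnight UTC on January 1st 2000.
const example = createTruncatedDate(Date.UTC(2000, 0, 1) + 23.045);
// example.toISOString() is "2000-01-01T00:00:00.023Z"
```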
The Document Object Model (DOM) is a representation — a model — of a document and its content. The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM. [DOM]
Implementations must support DOM and the events defined in UI Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM interfaces. [DOM] [UIEVENTS]
In particular, the following features are defined in the WHATWG DOM standard: [DOM]
Node, and the concept of cloning steps used by that algorithm
MutationObserver interface and mutation observers in general
The following features are defined in the UI Events specification: [UIEVENTS]
The following features are defined in the Touch Events specification: [TOUCH]
The following features are defined in the Pointer Events specification: [POINTEREVENTS]
This specification sometimes uses the term name to refer to the event's
type; as in, "an event named
click" or "if the event name is
keypress". The terms
"name" and "type" for events are synonymous.
The following features are defined in the DOM Parsing and Serialization specification: [DOMPARSING]
User agents are encouraged to implement the features described in the execCommand specification. [EXECCOMMAND]
The following parts of the WHATWG Fullscreen API standard are referenced from this
specification, in part to define the rendering of
dialog elements, and also to
define how the Fullscreen API interacts with HTML: [FULLSCREEN]
This specification uses the following features defined in the File API specification: [FILEAPI]
The following terms are defined in the Media Source Extensions specification: [MEDIASOURCE]
The following terms are defined in the Media Capture and Streams specification: [MEDIASTREAM]
The following features and terms are defined in the XMLHttpRequest specification: [XHR]
The following features are defined in the Battery Status API specification: [BATTERY]
While support for CSS as a whole is not required of implementations of this specification (though it is encouraged, at least for Web browsers), some features are defined in terms of specific CSS requirements.
When this specification requires that something be parsed according to a particular CSS grammar, the relevant algorithm in the CSS Syntax specification must be followed, including error handling rules. [CSSSYNTAX]
For example, user agents are required to close all open constructs upon finding the end of a style sheet unexpectedly. Thus, when parsing the string "rgb(0,0,0" (with a missing close-parenthesis) for a color value, the close parenthesis is implied by this error handling rule, and a value is obtained (the color 'black'). However, the similar construct "rgb(0,0," (with both a missing parenthesis and a missing "blue" value) cannot be parsed, as closing the open construct does not result in a viable value.
To parse a CSS <color> value, given a string input with an optional element element, run these steps:
If color is failure, then return failure.
If color is 'currentcolor', then:
The following terms and features are defined in the CSS specification: [CSS]
The CSS specification also defines the following border properties: [CSS]
The terms intrinsic width and intrinsic height refer to the width dimension and the height dimension, respectively, of intrinsic dimensions.
The following terms and features are defined in the CSS Logical Properties specification: [CSSLOGICAL]
The following terms and features are defined in the CSS Color specification: [CSSCOLOR]
The following features are defined in the CSS Backgrounds and Borders specification: [CSSBG]
The following features are defined in the CSS Fonts specification: [CSSFONTS]
The following features are defined in the CSS Positioned Layout specification: [CSSPOSITION]
The following features are defined in the CSS Table specification: [CSSTABLE]
The following features are defined in the CSS Text specification: [CSSTEXT]
The following features are defined in the CSS Writing Modes specification: [CSSWM]
The following features are defined in the CSS Basic User Interface specification: [CSSUI]
The following features and terms are defined in the CSS Syntax specifications: [CSSSYNTAX]
The following terms are defined in the Selectors specification: [SELECTORS]
The following features are defined in the CSS Values and Units specification: [CSSVALUES]
The following terms are defined in the CSS Cascading and Inheritance specification: [CSSCASCADE]
The CanvasRenderingContext2D object's use of fonts depends on the features described in the CSS Fonts and Font Loading specifications, including FontFace objects and the font source concept.
The following interfaces and terms are defined in the Geometry Interfaces Module specification: [GEOMETRY]
The following term is defined in the Intersection Observer specification: [INTERSECTIONOBSERVER]
The following interface is defined in the WebGL specification: [WEBGL]
Implementations may support WebVTT as a text track format for subtitles, captions, metadata, etc., for media resources. [WEBVTT]
The following terms, used in this specification, are defined in the WebVTT specification:
The following terms are defined in the WHATWG Fetch standard: [FETCH]
The following terms are defined in the WebSocket protocol specification: [WSP]
The role attribute is defined in the ARIA specification, as are the following roles: [ARIA]
In addition, the following
attributes are defined in the ARIA specification: [ARIA]
Finally, the following terms are defined in the ARIA specification: [ARIA]
The following terms are defined in Content Security Policy: [CSP]
The following terms are defined in Service Workers: [SW]
The following algorithm is defined in Secure Contexts: [SECURE-CONTEXTS]
The following terms are defined in Feature Policy: [FEATUREPOLICY]
The following feature is defined in the Payment Request API specification: [PAYMENTREQUEST]
While support for MathML as a whole is not required by this specification (though it is encouraged, at least for Web browsers), certain features depend upon small parts of MathML being implemented. [MATHML]
The following features are defined in the MathML specification:
While support for SVG as a whole is not required by this specification (though it is encouraged, at least for Web browsers), certain features depend upon parts of SVG being implemented.
User agents that implement SVG must implement the SVG 2 specification, and not any earlier revisions.
The following features are defined in the SVG 2 specification: [SVG]
The following feature is defined in the Filter Effects specification: [FILTERS]
The following feature is defined in the Worklets specification: [WORKLETS]
This specification might have certain additional requirements on character encodings, image formats, audio formats, and video formats in the respective sections.
Vendor-specific proprietary user agent extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.
All extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
For example, while strongly discouraged from doing so, an implementation could add a new IDL attribute "typeTime" to a control that returned the time it took the user to select the current value of a control (say). On the other hand, defining a new control that appears in a form's elements array would be in violation of the above requirement, as it would violate the definition of elements given in this specification.
When vendor-neutral extensions to this specification are needed, either this specification can be updated accordingly, or an extension specification can be written that overrides the requirements in this specification. When someone applying this specification to their activities decides that they will recognize the requirements of such an extension specification, it becomes an applicable specification for the purposes of conformance requirements in this specification.
Someone could write a specification that defines any arbitrary byte stream as conforming, and then claim that their random junk is conforming. However, that does not mean that their random junk actually is conforming for everyone's purposes: if someone else decides that that specification does not apply to their work, then they can quite legitimately say that the aforementioned random junk is just that, junk, and not conforming at all. As far as conformance goes, what matters in a particular community is what that community agrees is applicable.
User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them.
When support for a feature is disabled (e.g. as an emergency measure to mitigate a security problem, or to aid in development, or for performance reasons), user agents must act as if they had no support for the feature whatsoever, and as if the feature was not mentioned in this specification. For example, if a particular feature is accessed via an attribute in a Web IDL interface, the attribute itself would be omitted from the objects that implement that interface — leaving the attribute on the object but making it return null or throw an exception is insufficient.
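The difference between omitting a member and merely neutering it can be sketched as follows. This is an illustrative model only, not a Web IDL binding: `make_widget`, the feature name `"example-feature"`, and the member `example_attribute` are hypothetical names introduced for the example.

```python
from types import SimpleNamespace
from typing import AbstractSet


def make_widget(enabled_features: AbstractSet[str]) -> SimpleNamespace:
    """Build a hypothetical interface object for the given enabled features."""
    # Members that are always supported are always present.
    widget = SimpleNamespace(value="")
    # A member belonging to a disabled feature is not attached at all,
    # rather than being attached and made to return None or raise.
    if "example-feature" in enabled_features:
        widget.example_attribute = 42
    return widget


with_feature = make_widget({"example-feature"})
without_feature = make_widget(set())
```

Under this model, feature detection on `without_feature` (for instance `hasattr(without_feature, "example_attribute")`) behaves exactly as it would in an implementation that never supported the feature.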
Implementations of XPath 1.0 that operate on HTML documents parsed or created in the manners described in this specification (e.g. as part of the document.evaluate() API) must act as if the following edit was applied to the XPath 1.0 specification.
First, remove this paragraph:
A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
Then, insert in its place the following:
A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. If the QName has a prefix, then there must be a namespace declaration for this prefix in the expression context, and the corresponding namespace URI is the one that is associated with this prefix. It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
If the QName has no prefix and the principal node type of the axis is element, then the default element namespace is used. Otherwise if the QName has no prefix, the namespace URI is null. The default element namespace is a member of the context for the XPath expression. The value of the default element namespace when executing an XPath expression through the DOM3 XPath API is determined in the following way:
- If the context node is from an HTML DOM, the default element namespace is "http://www.w3.org/1999/xhtml".
- Otherwise, the default element namespace URI is null.
This is equivalent to adding the default element namespace feature of XPath 2.0 to XPath 1.0, and using the HTML namespace as the default element namespace for HTML documents. It is motivated by the desire to have implementations be compatible with legacy HTML content while still supporting the changes that this specification introduces to HTML regarding the namespace used for HTML elements, and by the desire to use XPath 1.0 rather than XPath 2.0.
This change is a willful violation of the XPath 1.0 specification, motivated by the desire to have implementations be compatible with legacy content while still supporting the changes that this specification introduces to HTML regarding which namespace is used for HTML elements. [XPATH10]
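The amended QName expansion rule can be sketched in a few lines. This is a minimal model under stated assumptions, not an XPath implementation: the function names, the `principal_node_type` string, and the `context_is_html_dom` flag are hypothetical, and the expression context's namespace declarations are modeled as a plain mapping from prefix to namespace URI.

```python
from typing import Mapping, Optional, Tuple

HTML_NAMESPACE = "http://www.w3.org/1999/xhtml"


def default_element_namespace(context_is_html_dom: bool) -> Optional[str]:
    """Default element namespace for the expression context (None means null)."""
    return HTML_NAMESPACE if context_is_html_dom else None


def expand_qname(
    qname: str,
    namespaces: Mapping[str, str],
    principal_node_type: str,
    context_is_html_dom: bool,
) -> Tuple[Optional[str], str]:
    """Expand a node-test QName into a (namespace URI, local name) pair."""
    prefix, colon, local = qname.partition(":")
    if colon:
        # Prefixed name: the prefix must be declared in the expression context.
        if prefix not in namespaces:
            raise ValueError(f"no namespace declaration for prefix {prefix!r}")
        return (namespaces[prefix], local)
    if principal_node_type == "element":
        # Unprefixed name on an element axis: use the default element namespace.
        return (default_element_namespace(context_is_html_dom), qname)
    # Unprefixed name on any other axis (e.g. attribute): namespace URI is null.
    return (None, qname)
```

With this sketch, an unprefixed `p` on an element axis expands to the HTML namespace when the context node is from an HTML DOM, while an unprefixed `href` on the attribute axis keeps a null namespace URI.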
XSLT 1.0 processors outputting to a DOM when the output method is "html" (either explicitly or via the defaulting rule in XSLT 1.0) are affected as follows:
If the transformation program outputs an element in no namespace, the processor must, prior to constructing the corresponding DOM element node, change the namespace of the element to the HTML namespace, ASCII-lowercase the element's local name, and ASCII-lowercase the names of any non-namespaced attributes on the element.
This requirement is a willful violation of the XSLT 1.0 specification, required because this specification changes the namespaces and case-sensitivity rules of HTML in a manner that would otherwise be incompatible with DOM-based XSLT transformations. (Processors that serialize the output are unaffected.) [XSLT10]
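The adjustment applied to elements in no namespace can be sketched as below. This is an illustrative model, not part of any XSLT processor: `OutputElement`, `normalize_for_html_output`, and `ascii_lowercase` are hypothetical names, and attributes are modeled as a dictionary keyed by (namespace, local name). If two attribute names collide after lowercasing, this sketch simply keeps the later one, which is an assumption and not something the text above specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

HTML_NAMESPACE = "http://www.w3.org/1999/xhtml"

AttrKey = Tuple[Optional[str], str]  # (namespace URI or None, local name)


def ascii_lowercase(s: str) -> str:
    """Lowercase only A-Z; every other code point is left untouched."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


@dataclass
class OutputElement:
    namespace: Optional[str]
    local_name: str
    attributes: Dict[AttrKey, str] = field(default_factory=dict)


def normalize_for_html_output(el: OutputElement) -> OutputElement:
    """Apply the no-namespace adjustment before a DOM element node is built."""
    if el.namespace is not None:
        # Elements already in a namespace are not affected.
        return el
    attrs: Dict[AttrKey, str] = {}
    for (ns, name), value in el.attributes.items():
        # Only non-namespaced attribute names are lowercased.
        key = (ns, ascii_lowercase(name)) if ns is None else (ns, name)
        attrs[key] = value
    return OutputElement(HTML_NAMESPACE, ascii_lowercase(el.local_name), attrs)
```

Using ASCII-only lowercasing rather than `str.lower()` is deliberate here: the requirement names ASCII lowercasing, so non-ASCII letters in names are left as they are.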
This specification does not specify precisely how XSLT processing interacts with the HTML parser infrastructure (for example, whether an XSLT processor acts as if it puts any elements into a stack of open elements). However, XSLT processors must stop parsing if they successfully complete, and must set the current document readiness first to "interactive" and then to "complete" if they are aborted.
This specification does not specify how XSLT interacts with the navigation algorithm, how it fits in with the event loop, nor how error pages are to be handled (e.g. whether XSLT errors are to replace an incremental XSLT output, or are rendered inline, etc).
There are also additional non-normative comments regarding the interaction of XSLT and HTML in the script element section, and of XSLT, XPath, and HTML in the
Comparing two strings in a case-sensitive manner means comparing them exactly, code point for code point.
Except where otherwise stated, string comparisons must be performed in a case-sensitive manner.
A string pattern is a prefix match for a string s when pattern is not longer than s and truncating s to pattern's length leaves the two strings as matches of each other.
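The comparison and prefix-match definitions above can be sketched as follows. The function names are hypothetical. Python `str` values are sequences of code points, so `==` compares code point for code point with no case folding or Unicode normalization. The sketch assumes the default (case-sensitive) notion of a match, since the definition of a prefix match leaves the kind of match to the surrounding context.

```python
def is_case_sensitive_match(a: str, b: str) -> bool:
    """Exact comparison, code point for code point."""
    return a == b


def is_prefix_match(pattern: str, s: str) -> bool:
    """True if pattern is not longer than s and s truncated to pattern's length matches it."""
    if len(pattern) > len(s):
        return False
    return is_case_sensitive_match(s[: len(pattern)], pattern)
```

For example, `is_prefix_match("text/", "text/html")` holds, while `is_prefix_match("Text/", "text/html")` does not, because the comparison is case-sensitive.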