Converter - XPath Guide
XPath (XML Path Language) is the primary method for selecting elements in the StAX-XML converter. This guide covers all supported XPath patterns and best practices.
For a spec-oriented support summary, see the XPath 1.0 conformance matrix.
What is XPath?
XPath is a query language for selecting nodes in XML documents. Think of it like CSS selectors for XML:
- CSS Selector: div.container > p
- XPath: /div[@class='container']/p
The converter uses XPath to specify which elements to extract from your XML.
Why XPath?
XPath provides:
- Precision: Target exactly the elements you need
- Flexibility: Handle complex XML structures easily
- Standard: Well-documented, widely understood
- Power: Filter elements by attributes, position, and content
XPath Basics
Absolute Paths
Start from the document root with /:

    // Select <title> directly under <book>
    x.string().xpath('/book/title')
    // XML: <book><title>1984</title></book>
    // Result: "1984"

    // Nested path
    x.string().xpath('/library/book/title')
    // XML: <library><book><title>1984</title></book></library>
    // Result: "1984"

Descendant Search
Use // to search anywhere in the document:

    // Find <title> at any depth
    x.string().xpath('//title')
    // XML: <root><section><book><title>1984</title></book></section></root>
    // Result: "1984"

    // Useful for unknown structures
    x.array(x.string(), '//error') // All error messages

⚠️ Performance Note: // searches the entire document. Use absolute paths when possible for better performance.
Multiple descendant steps are supported when the converter uses the XPath 1.0 runtime evaluator:

    x.string().xpath('//root//books')
    x.array(x.string(), '//section//item')

    // Still faster when the structure is known
    x.string().xpath('/root/catalog/books')

Performance note: full XPath 1.0 expressions are evaluated over a lightweight document tree. Simple converter selectors can still use the compiled event-table fast path.
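To make the cost difference concrete, here is a minimal sketch that counts how many elements each strategy inspects on a tiny tree. The XmlNode type and the selectAbsolute and selectDescendant helpers are hypothetical illustrations written for this guide. They are not the converter's evaluator or its event-table fast path.

```typescript
// Minimal element tree for illustration; not the converter's internal model.
interface XmlNode {
  name: string;
  children: XmlNode[];
}

function el(name: string, children: XmlNode[] = []): XmlNode {
  return { name: name, children: children };
}

// Absolute path such as '/root/catalog/books': one name test per level,
// so only the children of already-matched nodes are inspected.
function selectAbsolute(root: XmlNode, path: string, stats: { visited: number }): XmlNode[] {
  const steps = path.split('/').filter(function (s) { return s.length > 0; });
  stats.visited += 1; // the root element itself
  if (steps.length === 0 || root.name !== steps[0]) return [];
  let current: XmlNode[] = [root];
  for (let i = 1; i < steps.length; i++) {
    const next: XmlNode[] = [];
    for (const node of current) {
      for (const child of node.children) {
        stats.visited += 1;
        if (child.name === steps[i]) next.push(child);
      }
    }
    current = next;
  }
  return current;
}

// Descendant search such as '//books': every element is inspected.
function selectDescendant(root: XmlNode, name: string, stats: { visited: number }): XmlNode[] {
  const matches: XmlNode[] = [];
  const walk = function (node: XmlNode): void {
    stats.visited += 1;
    if (node.name === name) matches.push(node);
    for (const child of node.children) walk(child);
  };
  walk(root);
  return matches;
}

const tree = el('root', [
  el('catalog', [el('books'), el('misc')]),
  el('archive', [el('old', [el('books')])]),
]);

const absStats = { visited: 0 };
const absMatches = selectAbsolute(tree, '/root/catalog/books', absStats);
console.log(absMatches.length, absStats.visited); // 1 5

const descStats = { visited: 0 };
const descMatches = selectDescendant(tree, 'books', descStats);
console.log(descMatches.length, descStats.visited); // 2 7
```

The descendant search also returns the second books element under archive/old, which the absolute path never reaches. On real documents the gap in inspected elements grows with document size.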
Relative Paths
Use ./ for paths relative to current context:

    const specs = x.object({
      cpu: x.string().xpath('./cpu'),         // Relative to specs
      ram: x.string().xpath('./ram'),         // Relative to specs
      storage: x.string().xpath('./storage')  // Relative to specs
    }).xpath('/product/specs');

    // XML:
    // <product>
    //   <specs>
    //     <cpu>Intel i7</cpu>
    //     <ram>16GB</ram>
    //     <storage>512GB</storage>
    //   </specs>
    // </product>

Relative paths only work when the parent object/array has an XPath set.
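One way to picture relative paths is as a suffix appended to the parent's XPath. The sketch below is a mental model only: resolveRelative is a hypothetical helper, and the error it throws is a stand-in for "no parent XPath was set", not the converter's documented behavior.

```typescript
// Hypothetical helper: combines a parent XPath with a field XPath.
function resolveRelative(contextPath: string | undefined, fieldPath: string): string {
  // Absolute ('/a/b') and descendant ('//a') paths do not need a context.
  if (fieldPath.indexOf('./') !== 0 && fieldPath !== '.') {
    return fieldPath;
  }
  if (contextPath === undefined) {
    // Assumption for this sketch: a relative path without a parent XPath is an error.
    throw new Error('Relative XPath "' + fieldPath + '" needs a parent XPath');
  }
  if (fieldPath === '.') return contextPath; // '.' is the context element itself
  return contextPath + '/' + fieldPath.slice(2);
}

console.log(resolveRelative('/product/specs', './cpu')); // /product/specs/cpu
console.log(resolveRelative('/book', './@id'));          // /book/@id
console.log(resolveRelative(undefined, '/user/name'));   // /user/name
```

Read this way, ./cpu under an object with XPath /product/specs targets the same element as the absolute path /product/specs/cpu.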
Selecting Attributes
Simple Attributes
Use /@ to select attribute values:

    // Select id attribute
    x.string().xpath('/book/@id')
    // XML: <book id="123">Title</book>
    // Result: "123"

    // Nested attribute
    x.number().xpath('/product/item/@price')
    // XML: <product><item price="19.99">Widget</item></product>
    // Result: 19.99

Descendant Attributes
Search for attributes anywhere with //@:

    // Find any href attribute
    x.string().xpath('//@href')
    // XML: <html><body><a href="http://example.com">Link</a></body></html>
    // Result: "http://example.com"

    // All id attributes
    x.array(x.string(), '//@id')
    // XML: <root><item id="1"/><item id="2"/></root>
    // Result: ["1", "2"]

Attribute in Objects

    const book = x.object({
      id: x.number().xpath('/book/@id'),
      title: x.string().xpath('/book/title'),
      category: x.string().xpath('/book/@category')
    });
    // XML: <book id="123" category="fiction"><title>1984</title></book>
    // Result: { id: 123, title: "1984", category: "fiction" }

XPath Predicates
Predicates filter elements based on conditions using [...].
Attribute Value Predicates

    // Books with category="fiction"
    const fictionBooks = x.array(
      x.object({
        title: x.string().xpath('./title'),
        author: x.string().xpath('./author')
      }),
      '//book[@category="fiction"]'
    );

    // XML:
    // <library>
    //   <book category="fiction"><title>1984</title><author>Orwell</author></book>
    //   <book category="science"><title>Brief History</title><author>Hawking</author></book>
    // </library>
    //
    // Result: [{ title: "1984", author: "Orwell" }]

Multiple Conditions

    // Products that are available and in stock
    x.array(
      x.object({...}),
      '//product[@available="true"][@inStock="true"]'
    );

    // Or combine with 'and'
    x.array(
      x.object({...}),
      '//product[@available="true" and @inStock="true"]'
    );

Position Predicates

    // First book
    x.object({...}).xpath('//book[1]')

    // Second book
    x.object({...}).xpath('//book[2]')

    // Third book
    x.object({...}).xpath('//book[3]')

Position functions are also supported by the XPath 1.0 runtime evaluator:

    x.object({...}).xpath('//book[last()]')
    x.object({...}).xpath('//book[position() = 2]')

XPath with Different Schemas
Section titled “XPath with Different Schemas”String Schema
Section titled “String Schema”// Element content (includes nested elements)x.string().xpath('/message')
// Direct text content only (excludes nested elements)x.string().xpath('/message/text()')
// Example difference:// XML: <div>Hello <span>World</span></div>x.string().xpath('/div') // "Hello World" (all text)x.string().xpath('/div/text()') // "Hello " (direct text only)
// Attributex.string().xpath('/@type')
// Nested elementx.string().xpath('/response/data/value')
// Descendantx.string().xpath('//error')Number Schema
Section titled “Number Schema”// Parse numeric contentx.number().xpath('/product/price')
// Parse numeric attributex.number().xpath('/item/@quantity')
// With validationx.number().xpath('//age').min(0).max(120)Object Schema - Two Approaches
Section titled “Object Schema - Two Approaches”Approach 1: Absolute paths in fields
const user = x.object({ name: x.string().xpath('/user/name'), email: x.string().xpath('/user/email'), age: x.number().xpath('/user/age')});Approach 2: Object XPath with relative fields (Recommended)
const user = x.object({ name: x.string().xpath('./name'), email: x.string().xpath('./email'), age: x.number().xpath('./age')}).xpath('/user');Both produce the same result, but Approach 2 is more maintainable.
Array Schema
Arrays require XPath to specify which elements to collect:

    // Array of strings
    x.array(x.string(), '//item')

    // Array of numbers
    x.array(x.number(), '//value')

    // Array of objects with predicate
    x.array(
      x.object({
        name: x.string().xpath('./name'),
        price: x.number().xpath('./price')
      }),
      '//product[@available="true"]'
    )

Real-World XPath Examples
RSS Feed Parsing

    const rss = x.object({
      channelTitle: x.string().xpath('/rss/channel/title'),
      channelLink: x.string().xpath('/rss/channel/link'),
      items: x.array(
        x.object({
          title: x.string().xpath('./title'),
          link: x.string().xpath('./link'),
          description: x.string().xpath('./description'),
          pubDate: x.string().xpath('./pubDate')
        }),
        '/rss/channel/item'
      )
    });

SVG Document

    const svg = x.object({
      width: x.number().xpath('/svg/@width'),
      height: x.number().xpath('/svg/@height'),
      circles: x.array(
        x.object({
          cx: x.number().xpath('./@cx'),
          cy: x.number().xpath('./@cy'),
          r: x.number().xpath('./@r'),
          fill: x.string().xpath('./@fill')
        }),
        '//circle'
      ),
      rectangles: x.array(
        x.object({
          x: x.number().xpath('./@x'),
          y: x.number().xpath('./@y'),
          width: x.number().xpath('./@width'),
          height: x.number().xpath('./@height')
        }),
        '//rect'
      )
    });

Configuration File

    const config = x.object({
      appName: x.string().xpath('/config/app/name'),
      version: x.string().xpath('/config/app/version'),
      database: x.object({
        host: x.string().xpath('./host'),
        port: x.number().xpath('./port').int(),
        name: x.string().xpath('./database'),
        credentials: x.object({
          username: x.string().xpath('./username'),
          password: x.string().xpath('./password')
        }).xpath('./credentials')
      }).xpath('/config/database'),
      features: x.array(
        x.object({
          name: x.string().xpath('./@name'),
          enabled: x.string().xpath('./@enabled').transform(v => v === 'true')
        }),
        '/config/features/feature'
      )
    });

E-Commerce Product Catalog

    const catalog = x.object({
      storeName: x.string().xpath('/catalog/@name'),
      categories: x.array(
        x.object({
          id: x.number().xpath('./@id'),
          name: x.string().xpath('./@name'),
          products: x.array(
            x.object({
              id: x.number().xpath('./@id'),
              name: x.string().xpath('./name'),
              price: x.number().xpath('./price').min(0),
              inStock: x.string().xpath('./@inStock').transform(v => v === 'true'),
              tags: x.array(x.string(), './tag')
            }),
            './product'
          )
        }),
        '/catalog/category'
      )
    });

HTML-like Document

    const html = x.object({
      title: x.string().xpath('/html/head/title'),
      metaDescription: x.string().xpath('/html/head/meta[@name="description"]/@content'),
      links: x.array(
        x.object({
          href: x.string().xpath('./@href'),
          text: x.string().xpath('.')
        }),
        '//a'
      ),
      images: x.array(
        x.string().xpath('./@src'),
        '//img'
      )
    });

XPath Best Practices
1. Be Specific

    // ❌ Too broad - matches all names
    x.string().xpath('//name')

    // ✅ Specific path
    x.string().xpath('/user/profile/name')

    // ✅ With predicate
    x.string().xpath('//user[@role="admin"]/name')

2. Use Relative Paths with Objects

    // ❌ Repetitive absolute paths
    const user = x.object({
      name: x.string().xpath('/user/profile/details/name'),
      email: x.string().xpath('/user/profile/details/email'),
      phone: x.string().xpath('/user/profile/details/phone')
    });

    // ✅ Cleaner with object XPath
    const user = x.object({
      name: x.string().xpath('./name'),
      email: x.string().xpath('./email'),
      phone: x.string().xpath('./phone')
    }).xpath('/user/profile/details');

3. Use Predicates for Filtering

    // ❌ Get all products, filter in code
    const products = x.array(x.object({...}), '//product').parseSync(xml);
    const available = products.filter(p => p.available);

    // ✅ Filter with XPath
    const available = x.array(
      x.object({...}),
      '//product[@available="true"]'
    );

4. Prefer Absolute Over Descendant

    // ❌ Slower - searches entire document
    x.string().xpath('//deeply/nested/element')

    // ✅ Faster - direct path
    x.string().xpath('/root/section/deeply/nested/element')

Use // only when:
- Structure is unknown or variable
- Element can appear at multiple levels
- Convenience outweighs performance
5. Combine Attributes and Elements

    const book = x.object({
      id: x.string().xpath('./@id'),             // Attribute
      isbn: x.string().xpath('./@isbn'),         // Attribute
      title: x.string().xpath('./title'),        // Element
      author: x.string().xpath('./author'),      // Element
      category: x.string().xpath('./@category')  // Attribute
    }).xpath('//book');

Common XPath Patterns Cheat Sheet
| Pattern | Example | Description |
|---|---|---|
| /element | /book | Root element |
| /parent/child | /book/title | Direct child |
| //element | //title | Anywhere in document |
| /@attr | /@id | Attribute of current element |
| //@attr | //@href | Attribute anywhere |
| /element/@attr | /book/@id | Specific element’s attribute |
| //element[@attr="value"] | //book[@category="fiction"] | Element with attribute value |
| //element[1] | //book[1] | First matching element |
| //element[2] | //book[2] | Second matching element |
| /element/text() | /message/text() | Direct text content only |
| ./child | ./name | Relative child |
| ./@attr | ./@id | Relative attribute |
| //element[@a="x"][@b="y"] | //product[@available="true"][@inStock="true"] | Multiple conditions |
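The position rows above read naturally as "first match" and "second match". Under plain XPath 1.0 rules, a positional predicate on a step counts within each parent, so //book[1] can return one book per parent, while (//book)[1] returns the single first book in document order. Whether the converter's compiled fast path follows this rule exactly is not stated in this guide, so treat the sketch below as a model of the XPath 1.0 rule and verify against your own data. The Elem type and both helpers are hypothetical.

```typescript
interface Elem {
  name: string;
  label: string;
  children: Elem[];
}

function elem(name: string, label: string, children: Elem[] = []): Elem {
  return { name: name, label: label, children: children };
}

// Models //name[n]: for every parent, its n-th child element called `name` (1-based).
// The root element itself is not considered in this simplified version.
function nthChildPerParent(root: Elem, name: string, n: number): Elem[] {
  const out: Elem[] = [];
  const walk = function (parent: Elem): void {
    let seen = 0;
    for (const child of parent.children) {
      if (child.name === name) {
        seen += 1;
        if (seen === n) out.push(child);
      }
    }
    for (const child of parent.children) walk(child);
  };
  walk(root);
  return out;
}

// Models (//name)[n]: collect all matches in document order, then take the n-th.
function nthInDocument(root: Elem, name: string, n: number): Elem | undefined {
  const all: Elem[] = [];
  const walk = function (node: Elem): void {
    if (node.name === name) all.push(node);
    for (const child of node.children) walk(child);
  };
  walk(root);
  return all[n - 1];
}

const library = elem('library', 'lib', [
  elem('shelf', 's1', [elem('book', 'A'), elem('book', 'B')]),
  elem('shelf', 's2', [elem('book', 'C')]),
]);

const perParent = nthChildPerParent(library, 'book', 1).map(function (b) { return b.label; });
console.log(perParent.join(',')); // A,C

const firstInDoc = nthInDocument(library, 'book', 1);
console.log(firstInDoc ? firstInDoc.label : 'none'); // A
```

When all matching elements share one parent, the two forms select the same element, which is the case in the flat examples in this guide.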
XPath 1.0 Coverage
The converter supports XPath 1.0 expressions through its runtime evaluator. This includes:
- All 13 XPath axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, and self
- Node tests: node(), text(), comment(), processing-instruction(), name tests, and wildcards
- Predicates with position() and last()
- Boolean, equality, relational, and arithmetic expressions
- XPath 1.0 core functions such as contains(), starts-with(), substring(), normalize-space(), translate(), count(), sum(), name(), local-name(), and namespace-uri()
- Namespaced element and attribute names via ParseOptions.xpathNamespaces

    const schema = x.string()
      .xpath('string(//p:book[last()]/p:title)')
      .parseSync(xml, { xpathNamespaces: { p: 'urn:books' } });

XPath 1.0 follows its own namespace rule: an unprefixed name test matches only nodes with no namespace. For default-namespaced XML, bind the namespace to a prefix in xpathNamespaces and use that prefix in the XPath expression.
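The namespace rule is easy to misread, so here is a small sketch of how a name test is compared against an element name under XPath 1.0. The QName type and nameTestMatches function are hypothetical and handle only plain and prefixed names (no wildcards). The bindings argument plays the role that xpathNamespaces plays in the example above.

```typescript
// Expanded name of an element; namespaceUri '' means "no namespace".
interface QName {
  namespaceUri: string;
  localName: string;
}

function nameTestMatches(test: string, node: QName, bindings: { [prefix: string]: string }): boolean {
  const colon = test.indexOf(':');
  if (colon === -1) {
    // Unprefixed test: the document's default namespace is ignored,
    // so only elements in no namespace can match.
    return node.namespaceUri === '' && node.localName === test;
  }
  const prefix = test.slice(0, colon);
  const local = test.slice(colon + 1);
  if (!Object.prototype.hasOwnProperty.call(bindings, prefix)) {
    throw new Error('Unbound namespace prefix: ' + prefix);
  }
  // Prefixed test: compare the bound URI, not the prefix used in the document.
  return node.namespaceUri === bindings[prefix] && node.localName === local;
}

// <book xmlns="urn:books"> has this expanded name:
const defaultNsBook: QName = { namespaceUri: 'urn:books', localName: 'book' };
// <book> with no namespace declaration:
const plainBook: QName = { namespaceUri: '', localName: 'book' };
const nsBindings = { p: 'urn:books' };

console.log(nameTestMatches('book', defaultNsBook, nsBindings));   // false
console.log(nameTestMatches('p:book', defaultNsBook, nsBindings)); // true
console.log(nameTestMatches('book', plainBook, nsBindings));       // true
```

The prefix in the expression does not have to match any prefix in the document. Only the namespace URI it is bound to is compared.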
Full XPath expressions are not streamed directly. When an expression cannot use the compiled fast path, the converter builds a lightweight in-memory document representation first and evaluates XPath against that tree.
Troubleshooting XPath
Section titled “Troubleshooting XPath”No Match Returns Empty/NaN
Section titled “No Match Returns Empty/NaN”const schema = x.string().xpath('/missing');const result = schema.parseSync('<root></root>');// Result: "" (empty string)
const numSchema = x.number().xpath('/missing');const numResult = numSchema.parseSync('<root></root>');// Result: NaNUse .optional() to get undefined instead:
x.string().xpath('/missing').optional();// Result: undefinedWrong Element Selected
Section titled “Wrong Element Selected”// ❌ Gets first name foundx.string().xpath('//name')// Could match: /user/name OR /company/name
// ✅ Be specificx.string().xpath('/user/name')Array Returns Empty
Section titled “Array Returns Empty”const items = x.array(x.string(), '//item').parseSync('<root></root>');// Result: []
// Check your XPath matches elements// Verify XML structureNext Steps
Section titled “Next Steps”- Learn about Transformations for post-processing
- See Schema Types for validation options
- Explore Examples for real-world XPath usage
- Check Writing XML for serialization