Schedule for Friday
9:00 | Registration desk opens |
9:30 | Opening and sponsors presentation |
9:40 | Stormy First Draft | Marta Bartnicka
10:10 | AI for XML Development – Advantages and Challenges | Octavian Nadolu
10:40 | XML in a GenAI World: Features & Follies | Dave Gullo
11:10 | Coffee break |
11:30 | Navigating and Updating Trees of Maps and Arrays | Michael Kay
12:15 | JSONPath: an IETF Proposed Standard, with comparisons to XPath | Alan Painter
12:45 | Containerizing XML Build Tools to Facilitate CI/CD | Edward Porter
13:00 | Lunch |
14:30 | QTI and InDesign | Mark Dunn
15:00 | Roundtrip LwDITA Document Editing with Petal | Alexandra von Criegern, Younes Bahloul and Adam Retter
15:30 | A publishing environment with exist-db | Juri Leino
16:00 | Coffee break |
16:30 | XMQ/HTMQ – see XML and HTML in a new light | Fredrik Öhrström
17:00 | <custom-element> as no-JS browser applications engine | Sasha Firsov
17:30 | Why Adding Some CSS Isn’t Enough | Anne Rudolf
18:00 | Closing of the day |
19:00 | Social Dinner – different location than last time!!! |
20:30 | DemoJam |
Session details
Stormy First Draft
Marta Bartnicka
Stormy is a web tool developed to deliver generative AI for our internal users with two goals:
– enhance baseline AI with company-specific technology information,
– protect the company’s intellectual property.
My presentation will cover the following key points:
– The genesis of the StormyAI project: the two goals explained
– How we develop the tool: web UI, generative AI models under the hood, and how we “teach” genAI about our technologies
– How we capture use cases of information developers, support, and engineers: summary of learnings from alpha and beta phases
– Copyright and Licensing considerations for AI-generated content
The presentation will include a live demo of Stormy AI – in real time if I can get into the company VPN, or from a backup recorded demo. A significant part of the demo will be generating a first draft of DITA XML documentation.
AI for XML Development – Advantages and Challenges
Octavian Nadolu
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
XML in a GenAI World: Features & Follies
Dave Gullo
Generative AI has exploded lately and holds tremendous potential for media generation. While most models can ingest and export a myriad of formats, including XML, the out-of-the-box experience of generating XML with an LLM can range from reliable to comical.
Our company uses a DSL (Domain-Specific Language) defined by XSD to specify the words and actions necessary to generate human-like software videos. We use XML as the declaration format for these videos, along with XSLT to transform many other formats into our parlance.
We’ll take a dive into first experiences with a number of LLMs to see how they stack up in terms of intelligent export to XML. Some surprising effects emerge, such as hallucinated namespaces, attributes and elements, and we offer some theories as to why. We’ll incorporate short videos demonstrating the behaviors of various LLMs, along with prompting strategies that help narrow the XML into workable, acceptable document structures.
The goal of this 30-minute discussion is to educate (and perhaps entertain) the audience about these techniques. We will attempt some live demos or examples, with pre-recorded videos as a fallback in case of technical difficulties. If there is time for audience interaction, it would be beneficial to learn about techniques or tooling that help LLMs output XML reliably.
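One way to catch the hallucinated namespaces, elements and attributes mentioned in this abstract is to check LLM output against the known vocabulary before using it. The following is only a minimal sketch, not the speaker's method: the namespace `urn:example:video` and the element and attribute names are hypothetical, and a real project would validate against its XSD rather than a hand-written table.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary for illustration only; a real project would
# derive this from its XSD rather than hard-code it.
ALLOWED_NS = "urn:example:video"
ALLOWED = {
    "video": {"title"},
    "scene": {"id"},
    "say": set(),
}

def find_hallucinations(xml_text):
    """Return a list of problems found in LLM-produced XML."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    for elem in root.iter():
        # ElementTree spells qualified names as "{namespace}local".
        if elem.tag.startswith("{"):
            ns, local = elem.tag[1:].split("}", 1)
        else:
            ns, local = "", elem.tag
        if ns != ALLOWED_NS:
            problems.append(f"unexpected namespace '{ns}' on <{local}>")
        elif local not in ALLOWED:
            problems.append(f"unknown element <{local}>")
        else:
            for attr in elem.attrib:
                if attr not in ALLOWED[local]:
                    problems.append(f"unknown attribute '{attr}' on <{local}>")
    return problems
```

An empty result means the document is well-formed and uses only the listed names. Any reported problem could be fed back into a follow-up prompt.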
Navigating and Updating Trees of Maps and Arrays
Michael Kay
This talk describes the challenges of navigating and updating tree representations of JSON data that lack parent pointers and node identity, with reference to use cases presented at earlier conferences. It also describes the progress made in solving these use cases with new XSLT and XQuery language features developed as part of the 4.0 standardisation effort.
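To illustrate the underlying problem (not the XSLT or XQuery 4.0 features themselves), here is a small Python sketch. A value inside nested maps and arrays has no parent pointer, so the code records the path to each value while walking the tree, then uses that path to find the parent and to build an updated copy. The helper names `walk`, `get` and `updated` are made up for this example.

```python
def walk(value, path=()):
    """Yield (path, value) for every value in a tree of dicts and lists."""
    yield path, value
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, path + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from walk(child, path + (index,))

def get(tree, path):
    """Follow a path of keys and indexes down from the root."""
    for step in path:
        tree = tree[step]
    return tree

def updated(tree, path, new_value):
    """Return a copy of tree with the value at path replaced."""
    if not path:
        return new_value
    head, rest = path[0], path[1:]
    if isinstance(tree, dict):
        return {**tree, head: updated(tree[head], rest, new_value)}
    copy = list(tree)
    copy[head] = updated(tree[head], rest, new_value)
    return copy

data = {"books": [{"title": "A", "price": 10}, {"title": "B", "price": 25}]}

# The value 25 does not know where it lives; only the recorded path does.
path_to_25 = next(p for p, v in walk(data) if v == 25)
parent = get(data, path_to_25[:-1])      # the map that contains the price
cheaper = updated(data, path_to_25, 20)  # new tree; data is left unchanged
```

Carrying the path alongside the value is one common workaround in general-purpose languages. The update copies only the containers along the path and shares everything else with the original tree.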
JSONPath: an IETF Proposed Standard, with comparisons to XPath
Alan Painter
The Internet Engineering Task Force (IETF) has recently promoted RFC 9535, entitled “JSONPath: Query Expressions for JSON”, to “Proposed Standard”. This new proposed standard states in Section 1.2 that JSONPath is “inspired by XML’s XPath”. The XML community has used different versions of XPath for querying XML for many years, and more recently for querying JSON via XPath 3.1, so a comparison of the relatively new JSONPath proposed standard with XPath can be of some utility.
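As a rough illustration of the kind of comparison involved, the Python sketch below evaluates a deliberately tiny subset of JSONPath (`$`, `.name`, `[n]` and `[*]`). It is a toy, not a conforming RFC 9535 implementation, and the function name `select` is made up here. The comments give what should be roughly equivalent XPath 3.1 lookup expressions; note that JSONPath array indexes start at 0 while XPath array positions start at 1.

```python
import re

# Toy subset only: "$", ".name", "[n]" and "[*]". RFC 9535 defines much more
# (filters, slices, descendant segments), none of which is handled here.
_STEP = re.compile(r"\.(\w+)|\[(\d+)\]|\[(\*)\]")

def select(query, document):
    """Return the list of values selected by a (toy-subset) JSONPath query."""
    if not query.startswith("$"):
        raise ValueError("a JSONPath query starts with '$'")
    nodes = [document]
    pos = 1
    while pos < len(query):
        match = _STEP.match(query, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at offset {pos}")
        name, index, wildcard = match.groups()
        selected = []
        for node in nodes:
            if name is not None and isinstance(node, dict) and name in node:
                selected.append(node[name])
            elif index is not None and isinstance(node, list) and int(index) < len(node):
                selected.append(node[int(index)])
            elif wildcard and isinstance(node, dict):
                selected.extend(node.values())
            elif wildcard and isinstance(node, list):
                selected.extend(node)
        nodes = selected
        pos = match.end()
    return nodes

data = {"store": {"book": [{"title": "A"}, {"title": "B"}]}}

# JSONPath: $.store.book[*].title    XPath 3.1 (roughly): ?store?book?*?title
all_titles = select("$.store.book[*].title", data)

# JSONPath: $.store.book[0].title    XPath 3.1 (roughly): ?store?book?1?title
first_title = select("$.store.book[0].title", data)
```

As in JSONPath itself, a step that does not apply to a node (for example a name applied to an array) selects nothing instead of raising an error.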
Containerizing XML Build Tools to Facilitate CI/CD
Edward Porter
For nearly 15 years, Continuous Integration / Continuous Delivery (CI/CD) has been the mantra of software development. Countless hours have been spent in search of ever more robust automated build and testing pipelines. For years, that stack relied on carefully configured build environments on CI machines running Jenkins or some other automation software. Today, the software development community is moving increasingly to containerizing build environments and tooling to make builds more modular and less dependent on the configuration of the machines on which they run. The XML stack suffers from the same hobbling effects of brittle builds and fickle tooling, evidenced by differences between builds in a local dev environment and builds in a pipeline. In this article, we explore the power and flexibility provided by containerizing XML build tooling, both for end users in their development work and in build pipelines. The aim is to ease content development for subject matter experts and to produce modular, predictable build tooling for CI/CD pipelines regardless of platform.
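The idea can be sketched as a small Python helper that builds the same `docker run` command for a developer's machine and for a pipeline, so that both use the tooling baked into the image. This is an illustration only, not the setup described in the talk: the image name `example/xml-build:1.0` is hypothetical, and the example assumes the image provides `xsltproc`.

```python
import os
import shlex

def container_command(image, tool_args, workdir="."):
    """Build a `docker run` argv that runs an XML tool against workdir."""
    host_dir = os.path.abspath(workdir)
    return [
        "docker", "run", "--rm",     # remove the container when it exits
        "-v", f"{host_dir}:/work",   # mount the project into the container
        "-w", "/work",               # run the tool from the mounted directory
        image,
        *tool_args,
    ]

# Hypothetical image name; assumed to contain xsltproc.
cmd = container_command(
    "example/xml-build:1.0",
    ["xsltproc", "-o", "out.html", "style.xsl", "doc.xml"],
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Because the tool version lives in the image tag, local builds and pipeline builds differ only in the directory that gets mounted.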
QTI and InDesign
Mark Dunn
This paper describes a project to automate a process for generating student worksheets in print and digital formats from a single source. The source XML format is QTI 2.1. The requirement was to produce from this source an Adobe InDesign document from which we can export a print PDF suitable for publication. We describe some basic concepts of InDesign and QTI, and outline the proposed new process, the XSLT transformation design, and some of the particular challenges that were encountered.
Roundtrip LwDITA Document Editing with Petal
Alexandra von Criegern, Younes Bahloul and Adam Retter
The talk will focus on Petal, an Open Source in-browser text editor we are building that is designed specifically for LwDITA (Lightweight DITA) documents. We will cover its functionality, including how it utilises JDITA (JSON Lightweight DITA) and our ProseMirror extension to create a user-friendly WYSIWYG interface for editing XML documents directly within a Web Browser. We will discuss the workflow involved in round-trip editing of LwDITA through a process of transforming XDITA (XML Lightweight DITA) documents to JDITA documents, rendering JDITA with ProseMirror, and then serializing JDITA back to XDITA.
A publishing environment with exist-db
Juri Leino
This talk will showcase the publishing environment that powers history.state.gov and how the different packages and libraries developed over the last couple of years tie together to enable efficient deployments, fast development cycles and good response times at scale.
We will start with setting up database servers from the ground up with Ansible and xst.
We then move on to developing applications with automated releases, Docker builds and routing with Roaster, editing content in local editors, integrating data from external databases like Airtable, and adding annotations with tei-publisher.
We will show how version control is integrated and how data is synced for preview and publishing with tuttle and airlock.
We will also have a look at caching in the application, database and proxy layers.
XMQ/HTMQ – see XML and HTML in a new light
Fredrik Öhrström
XMQ is an alternative format for XML/HTML which is easier to read and write for non-markup use cases, such as data storage, configuration files and documents with layout. For these use cases the standard format for XML/HTML is not always ideal. For example, there is unnecessary verbosity in using markup tags for both opening and closing, and the whitespace rules make pretty printing hard and sometimes impossible.
XMQ solves these problems by using braces for the hierarchical structure, and it simplifies whitespace handling by requiring all content whitespace to be quoted. Tags with simple content can be presented as key=value pairs where safe values need not be quoted at all. Quoted content has its incidental indentation removed, which means that it can always be pretty printed. This makes XMQ easier to read and write and maps well to other key-value based languages like JSON while maintaining full XML compatibility. XMQ can always be printed in a single-line compact form which often requires fewer bytes than the corresponding compact XML/HTML.
<custom-element> as no-JS browser applications engine
Sasha Firsov
<custom-element> is an open-source project and proof of concept for a W3C proposal for Declarative Custom Element syntax.
It aims to cover the complete needs of web application development without JavaScript on the browser side. With a fused HTML and XSLT syntax, it brings the power of the XML stack: XSLT, XPath, and XML DOM.
The full feature set includes XSLT templating, exposure of DOM and browser API data, and event handling, which is sufficient for developing simple flows such as form submission, sign-up, or sign-in.
The primary incentive for <custom-element> was a cybersecurity concern: the current web stack offers few safeguards when hosting code from multiple vendors. Most third-party functionality is currently available only via JavaScript, which exposes the entire browser context, from user input to network traffic, to any script that lands on the page.
<custom-element> is designed so that context belongs only to its own instance and nothing is shared with other instances, which allows third-party apps to be embedded without worrying about security.
As open-source software, it is free for everyone under a business-friendly licence. Syngrafact.AI provides support.
https://github.com/EPA-WG/custom-element
Why Adding Some CSS Isn’t Enough
Anne Rudolf
CSS is often understood as an easy styling language. But when it comes to large projects, structure is mandatory. Printed user manuals use various XML constructs, which are styled differently per customer, document type or even within the same document. But the more layout requirements there are for the same semantic construct, the more complex styling becomes.
To manage the amount of CSS for print and web, it seems obvious to consult concepts from web CSS like OOCSS, BEM or ITCSS. Although they focus on web interfaces, they point out the use of visual patterns. With them styling is detached from content and structure, and it becomes scalable.
This talk provides some ideas on how to structure CSS projects, using safety notices as an example.