Schedule for Friday
chaired by yours truly Mohamed Zergaoui (morning) and Jirka Kosek (afternoon)
9:00 | Registration desk opens |
9:30 | Opening and sponsors presentation |
9:40 | A case study of committee-based semantic model development of XSD and JSON schemas Ken Holman |
10:10 | X-definition 4.1: XML, JSON, YAML and XON Václav Trojan and Tomáš Šmíd |
10:40 | A Pilot Implementation of ixml Steven Pemberton |
11:10 | Coffee break |
11:40 | Expression Elaboration Michael Kay |
12:10 | A Benchmark Collection of Deterministic Automata for XPath Queries Antonio Al Serhali and Joachim Niehren |
12:40 | Use the Markup, Stupid! Ari Nordström |
13:10 | Lunch |
14:30 | Opening of afternoon sessions |
14:40 | Schematron State of the Union Tony Graham, David Maus, Andrew Sales and Erik Siegel |
15:10 | XSL-FO/CSS Comparison Tony Graham |
15:40 | Coffee break |
16:10 | Success with XSD as a DSL for Software Video Generation Dave Gullo and Vít Janota |
16:40 | Structure! You get more than you see Cerstin Mahlow |
17:10 | Closing of the day |
19:00 | Social dinner & DemoJam |
Session details
A case study of committee-based semantic model development of XSD and JSON schemas
Ken Holman
This paper is a case study of the Organization for the Advancement of Structured Information Standards (OASIS) Universal Business Language (UBL) committee following the Open-edi approach of separating static semantic information design from syntactic data constraint expressions. The OASIS UBL committee is over 20 years old now. OASIS UBL ISO/IEC 19845 XML is used around the world in many business document interchange networks and environments. In UBL 2.3, business concepts govern 91 separate document types as onion-skins around a common core library of over 4000 information items.
» Read paper in proceedings
» Slides
X-definition 4.1: XML, JSON, YAML and XON
Václav Trojan and Tomáš Šmíd
X-definition 4.1 has been extended to work with JSON or YAML data in addition to supporting XML data (Properties and Windows INI data forms are also supported). The X-definition Object Notation (XON) format has been proposed as a generic format for writing supported data. The XON format is an extension of JSON. While JSON only knows string, number, boolean, and null values, the XON format allows you to specify dates, times, email addresses, IP addresses, number types, and all other types of values that X-definition supports.
» Read paper in proceedings
» Slides
A Pilot Implementation of ixml
Steven Pemberton
Invisible XML (ixml) is a method for treating non-XML documents as if they were XML, enabling authors to write documents and data in a format they prefer while providing XML for processes that are more effective with XML content. By the time of the publication of this paper, it is anticipated that the official version of ixml will have been announced by the ixml working group.
During the development of ixml, a pilot implementation was built in order to support decisions on the development of the notation, and to provide examples of the output ixml produces.
This paper describes the implementation, decisions taken, and how certain processes work, such as serialisation, and dealing with ambiguity, and ends by discussing future work to be done.
» Read paper in proceedings
» Slides
Expression Elaboration
Michael Kay
This paper describes an approach to evaluation of expression-based languages such as XSLT, XQuery, and XPath, in which nodes on the expression tree output by the language parser are converted to lambda expressions in Java, JavaScript, or C#, with the aim of doing as much work as possible once only, in advance of the actual expression evaluation.
» Read paper in proceedings
» Slides
A Benchmark Collection of Deterministic Automata for XPath Queries
Antonio Al Serhali and Joachim Niehren
We provide a benchmark collection of deterministic automata for regular XPath queries. For this, we select the subcollection of forward navigational XPath queries from a corpus that Lick and Schmitz extracted from real-world XSLT and XQuery programs, compile them to stepwise hedge automata (SHAs), and determinize them. Large blowups by automata determinization are avoided by using schema-based determinization. The schema captures the XML data model and the fact that any answer of a path query must return a single node. Our collection also provides deterministic nested word automata that we obtain by compilation from deterministic SHAs.
» Read paper in proceedings
» Slides
Use the Markup, Stupid!
Ari Nordström
The XML technology stack has been around for more than 20 years and is, by all accounts, mature. Why is it that some insist on non-XML solutions for XML problems? Use the markup, stupid!
» Read paper in proceedings
» Slides
Schematron State of the Union
Tony Graham, David Maus, Andrew Sales and Erik Siegel
This is a discussion of the current state and possible future of Schematron and the ISO Standard for Schematron presented by some of the most active members of the Schematron community, including Andrew Sales, editor of the ISO Standard.
Schematron is at a crossroads. ISO Schematron 2020 has gone behind a paywall after previous versions were free, and ISO has no current plans to further update the Schematron standard. Yet Schematron is widely used, and there is a growing list of aspects of Schematron that could be clarified or enhanced in a future version of the standard. The next opportunity to influence ISO to restart the Schematron work is fast approaching. This session will look at how we got here and what the possible next steps are.
» Slides
XSL-FO/CSS Comparison
Tony Graham
Comparing XSL-FO and CSS formatting is not straightforward. XSL implementations are not standing still: XSL formatters are still incrementally improving even though the XSL Recommendation has not been updated since 2006. CSS is definitely not standing still, although some of the modules most relevant to paged media are advancing slowly, if at all, and some paged media features have been removed in more recent Working Drafts.
This is a high-level view of the differences and similarities between XSL-FO and CSS, based on an extensive new analysis by Antenna House that itself is formatted identically using both XSL-FO and CSS. It also covers some of the features of how the two versions are produced.
» Read paper in proceedings
» Slides
Success with XSD as a DSL for Software Video Generation
Dave Gullo and Vít Janota
Videate solves the “video debt” challenge often seen in software instructional videos caused by the SDLC. With each new software release, changes (small and large) occur, which invalidate some or all of the videos in a library. Using browser automation, quality text-to-speech, and human behavior generation, Videate is able to make videos in a fraction of the time required by traditional means. At the heart of the Videate platform is Spiel.XSD, which serves as a Domain Specific Language for our video rendering engine. Let’s dive in and see how XSD and XSLT allow Videate to take in a number of XML-based document formats, and turn them into high quality videos.
» Slides
Structure! You get more than you see
Cerstin Mahlow
In the 1990s, the focus on the printed page as the final product of writing with WYSIWYG tools clashed first with the development of the Web and a decade later with the advent of mobile devices. Both developments enabled, and required, new types of documents and thus demanded new tools and processes for producing these documents. In the 2010s, the emphasis on writing experience, personalization of tools, and the growing diversity of input devices, methods, and displays is the main reason for the design and development of “new writing tools.” Their functionalities are often working implementations of methods and concepts originally described and developed in the 1960s and 1970s that seem to have failed due to the limitations of computers at that time. Dedicated research on writing tools stopped in the late 1980s, once universities and companies had decided what to purchase and Microsoft Word had achieved monopoly status in the consumer market. The shift of academic writing to include dynamic aspects of “text,” e.g., code (snippets), data plots, and other visualizations, clearly demands other tools for text production than traditional word processors. When the printed page is no longer the desired final product, content and format can be addressed explicitly and separately, thus emphasizing the structure of texts rather than the structure of documents.