Friday, February 14th
Preconference schedule on a separate page.
Saturday, February 15th
chaired by yours truly Mohamed Zergaoui (morning) and Jirka Kosek (afternoon)
9:00 | Registration desk opens |
9:30 | Opening and sponsors presentation |
9:55 | Distributed Extensibility: Finally Done Right? Robin Berjon (W3C) |
10:40 | The web needs “XML: The Good Parts” Robbert Broersma (Frameless) and Yolijn van der Kolk (Frameless) |
11:10 | Coffee break |
11:40 | Standards update: XPath/XQuery 3.0/3.1 Jonathan Robie (EMC) |
12:00 | Standards update: XSLT 3.0 Michael Kay (Saxonica) |
12:40 | In consideration of improvements to XProc Norman Walsh (MarkLogic Corporation) |
13:10 | Lunch |
14:40 | Streaming for the masses Abel Braaksma (Exselt) |
15:10 | Streamability in Saxon Michael Kay (Saxonica) |
15:40 | Standards update: ITS 2.0 Felix Sasaki (DFKI) |
16:00 | Coffee break |
16:30 | XFormsUnit: the Framework to Test Them All Eric Van der Vlist (Dyomedea) |
17:00 | XSLT 3.0 Testbed Tony Graham (Mentea) |
17:30 | XML Schema Identity Constraints Revisited Anne Brüggemann-Klein (TU München), Mustapha Maalej (TU München) and Marouane Sayih (TU München) |
18:00 | Closing of the first day |
19:30 | Social dinner & Demo Jam |
Sunday, February 16th
chaired by yours truly Petr Cimprich (morning) and James Fuller (afternoon)
9:00 | Registration desk opens |
9:30 | Opening of the second day |
9:40 | Data and Documents, Together Again Charles Greer (MarkLogic) |
10:10 | Scientific Computing in the Open Web Platform Alex Milowski (University of Edinburgh) and Henry Thompson (University of Edinburgh) |
10:40 | RADL: RESTful API Description Language Jonathan Robie (EMC), Rémon Sinnema (EMC) and Erik Wilde (EMC) |
11:10 | Coffee break |
11:40 | XML Authoring On Mobile Devices George Bina (Syncro Soft) |
12:10 | A MathML Progress Report Autumn Cuellar (Design Science) |
12:40 | Finalising a (small) Standard John Lumley (JWL Research Ltd) |
13:10 | Lunch |
14:40 | Publishing in Style with XML Liam R E Quin (W3C) |
15:10 | Formatting from XML Tony Graham (Mentea) |
15:40 | Publishing Q&A panel |
16:10 | Coffee break |
16:40 | ProXist – XProc Processes in eXist Ari Nordström (Condesign AB) |
17:10 | What you’d like to see happen (or not) in the Web’s next 25 years? audience driven session moderated by Robin Berjon (W3C) |
18:00 | Closing of the conference |
Session details
Distributed Extensibility: Finally Done Right?
Robin Berjon (W3C)
It has long been a frequent goal of markup — and often document technologies in general — to enable extensibility by arbitrary third parties. However, all attempts to date have fallen largely short of their promises.
XML Namespaces do enjoy a modicum of (much reviled) success in this space, but they have managed this at the cost of a severe limitation in their power, covering only distributed extensibility in naming. Hopes that this would provide a solid foundation atop which richly extensible documents would be built have not come to fruition.
A new contender has recently entered this fray: Web Components. Trying to learn from past mistakes, they offer rich, if complex, extensibility functionality, notably in behaviour and styling.
This presentation will look at how Web Components work, what they offer (and have so far declined to offer) in terms of distributed extensibility, and show how they can be put to work to enable innovative behaviour in documents.
Robin Berjon is a freelance consultant carrying out research, prototyping, and standardisation in Web, mobile, and XML technologies. He has worked on both Web and XML standards for over a decade, and is currently trying to herd HTML5 to Recommendation as part of the W3C team. He lives in Paris, France, with his wife, two daughters, and a rather idiotic cat.
The web needs “XML: The Good Parts”
Robbert Broersma (Frameless) and Yolijn van der Kolk (Frameless)
Web development frameworks on the rise, such as AngularJS and Ember, now provide ‘two-way data bindings’ for form elements, programmatically providing Model-View-Controller features that XForms expresses declaratively in markup and XPath queries.
A new web standard is being drafted while simultaneously being implemented in JavaScript: Web Components. It aims to provide templates for custom elements in HTML, switching between templates using CSS selectors and using JavaScript to provide scriptable markup and event bindings. Every front-end development congress has at least one talk about this upcoming standard.
These are features that the XForms and XSLT standards already provided a long time ago. If these shared philosophies are growing more popular, then why aren’t these XML technologies themselves appealing to web developers?
Frameless is software implemented in JavaScript that aims to bring powerful features from XPath, XSLT and XForms together in existing browsers, and explores ways to combine the declarative real-time data bindings from XForms with the powerful templates of XSLT.
We will show what can be done to cut down on complexity, and how we can give developers more flexibility. We will argue why Web Components should not settle for less than the power of XSLT 2 templates, and how users of popular web frameworks are missing out on the good parts of the XML platform.
In consideration of improvements to XProc
Norman Walsh (MarkLogic Corporation)
XProc: An XML Pipeline Language has been gaining adoption steadily, if slowly. Even among its fans, the observation has been made that some aspects of the language frustrate new users. Recently, the XML Processing Model Working Group, the working group at the W3C responsible for the continued development of XProc, has drafted a new requirements document for V.next of the language.
The principal focus of V.next is usability improvements. These range from the relatively simple, such as obvious syntactic shortcuts, to the relatively audacious, such as removing entire language features determined to be more trouble than they are worth.
This paper reviews the current state of the art in XProc language design with an eye towards explaining and amplifying the efforts of the working group on the one hand, and on the other, encouraging members of the XML community to voice their concerns.
Norman Walsh is a Lead Engineer at MarkLogic Corporation where he helps to develop the world’s leading enterprise NoSQL database. Norm is also an active participant in a number of standards efforts worldwide: he is chair of the XML Processing Model Working Group at the W3C where he is also co-chair of the XML Core Working Group. At OASIS, he is chair of the DocBook Technical Committee.
With two decades of industry experience, Norm is well known for his work on DocBook and a wide range of open source projects. He is the author of DocBook: The Definitive Guide.
Streaming for the masses
Abel Braaksma (Exselt)
Streaming is often considered an elite technique that’s only understood and mastered by a happy few. However, streaming is everywhere nowadays: Twitter and news feeds, big data processing, log listeners, Facebook or any other social media board, forums, and streaming media such as movies and music. Why should it be hard to process such streams? The answer: it is not. If you follow a few simple rules you can apply streaming to many common scenarios.
This paper explains how to make a good decision about whether to use streaming or not. If you do need streaming, it introduces a methodology that is easy to memorize and master. For more complex and corner cases, it shows a flowchart-like model that can be followed when the easy model doesn’t deliver the expected results. Bottom line, it delivers XSLT streaming to the masses.
Abel Braaksma is the owner of Abrasoft and creator of the new streaming XSLT 3.0 processor Exselt. He has more than 15 years of experience with XML and related technologies and is currently an Invited Expert of the XSLT and XPath working groups at W3C. He can be reached about anything XML, C#, Java or F# related at info@abrasoft.net or for the Exselt processor at info@exselt.net. For his current thoughts on technologies, you can visit his blog at Under My Hat.
Streamability in Saxon
Michael Kay (Saxonica)
Streaming is a major new feature of the XSLT 3.0 specification, currently a Last Call Working Draft. This paper discusses streaming as defined in the W3C specification, and as implemented in Saxon. Streaming refers to the ability to transform a document that is too big to fit in memory, which depends on the transformation itself being in some sense linear, so that pieces of the output appear in the same order as the pieces of the input on which they depend. This constraint is reflected in the W3C specification by a set of streamability rules that determine statically whether a stylesheet is streamable or not.
This paper gives a tutorial introduction to the streamability rules and the way they are implemented in Saxon. It then goes on to describe the architecture used to implement streaming in the Saxon run-time, by means of push pipelines, and gives the rationale for this choice of architecture.
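To make the notion of a "linear" transformation concrete, here is a minimal sketch in Python rather than XSLT. It is not Saxon's implementation and does not model the W3C streamability rules; it only shows output being produced in the same order as the input it depends on, without keeping processed subtrees around. The `orders`, `order` and `item` element names and the `price` attribute are hypothetical, chosen for the illustration.

```python
import io
import xml.etree.ElementTree as ET


def stream_totals(source):
    """Yield one (order id, total) pair per <order>, in document order."""
    # iterparse reports each element once its end tag has been read,
    # so an <order> is handled as soon as it is complete.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "order":
            total = sum(float(item.get("price")) for item in elem.iter("item"))
            yield elem.get("id"), total
            # Drop the children and attributes of the processed order.
            # (The emptied element itself stays attached to the root.)
            elem.clear()


# Hypothetical input document, small here but processed incrementally.
doc = io.BytesIO(
    b"<orders>"
    b"<order id='a'><item price='1.50'/><item price='2.50'/></order>"
    b"<order id='b'><item price='10'/></order>"
    b"</orders>"
)
results = list(stream_totals(doc))
```

Each result depends only on the order that has just been read, which is what lets the input be discarded as processing moves forward. A computation that needed to look back at earlier orders would not fit this pattern.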
XFormsUnit: the Framework to Test Them All
Eric Van der Vlist (Dyomedea)
Current practices for testing XForms developments rely on generic web testing frameworks and expose implementation-specific details which can change from version to version. XForms forms can be incredibly complex and they deserve a proper test framework that allows tests to be defined using XForms paradigms. This talk presents XFormsUnit, a native XForms test framework.
Eric is an independent consultant and trainer. His domains of expertise include Web development and XML technologies. He is the creator and main editor of XMLfr.org, the main site dedicated to XML technologies in French, the author of the O’Reilly animal books XML Schema and RELAX NG and has been involved in the ISO DSDL (http://dsdl.org) working group focused on XML schema languages. He is based in Paris and you can reach him by mail (vdv@dyomedea.com) or meet him at one of the many conferences where he presents his projects.
XSLT 3.0 Testbed
Tony Graham (Mentea)
https://github.com/MenteaXML/xslt3testbed is a public, medium-sized XSLT 3.0 project where people could try out new XSLT 3.0 features on the transformations to (X)HTML(5) and XSL-FO that are what we do most often and, along the way, maybe come up with new design patterns for doing transformations using the higher-order functions, partial function application, and other goodies that XSLT 3.0 gives us.
Tony Graham has been working with markup since 1991, with XML since 1996, and with XSLT/XSL-FO since 1998. He is Chair of the Print and Page Layout Community Group at the W3C and previously an invited expert on the W3C XML Print and Page Layout Working Group (XPPL) defining the XSL-FO specification, as well as an acknowledged expert in XSLT, developer of the open source xmlroff XSL formatter, a committer to both the XSpec and Juxy XSLT testing frameworks, the author of “Unicode: A Primer”, a member of the XML Guild, and a qualified trainer.
Tony’s career in XML and SGML spans Japan, USA, UK, and Ireland, working with data in English, Chinese, Japanese, and Korean, and with academic, automotive, publishing, software, and telecommunications applications. He has also spoken about XML, XSLT, XSL-FO, EPUB, and related technologies to clients and conferences in North America, Europe, and Australia.
XML Schema Identity Constraints Revisited
Anne Brüggemann-Klein (TU München), Mustapha Maalej (TU München) and Marouane Sayih (TU München)
In this paper, we attempt to explain clearly our reading of XML Schema’s identity constraint concepts. We illustrate our reading extensively with examples, in the style of a tutorial. We also illustrate usage styles and limitations of identity constraints in XML Schema. Finally, we demonstrate how the limitations that we have identified can be bypassed with assertions as introduced by XPath 2.0 and XML Schema 1.1.
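As a rough illustration of what a uniqueness constraint checks, the sketch below mimics the selector/field idea in Python. It is not taken from the paper and is far simpler than XML Schema's actual rules: the selector is a plain ElementTree path, the field is a single attribute name, and the `library`, `book` and `isbn` names are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_unique(root, selector, field):
    """Return the field values that occur more than once among selected elements."""
    seen, duplicates = set(), []
    for elem in root.findall(selector):
        value = elem.get(field)
        if value is None:
            # As with xs:unique, elements lacking the field are not constrained.
            continue
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    return duplicates


# Hypothetical document: two books share an ISBN, one has none.
doc = ET.fromstring(
    "<library>"
    "<book isbn='111'/><book isbn='222'/><book isbn='111'/><book/>"
    "</library>"
)
dups = check_unique(doc, "./book", "isbn")
```

Here `dups` contains the repeated value `111`, while the book without an `isbn` attribute is ignored. A key-style constraint would instead treat the missing field as an error.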
Data and Documents, Together Again
Charles Greer (MarkLogic)
The practice of embedding RDF triples in XML documents proves a surprisingly useful paradigm for data stores that combine structured and unstructured data.
In this paper I consider well-known features of an XML document-oriented database, and mix those with RDF data and SPARQL queries. On the one hand, XML documents are well-suited for encoding human-readable text and markup. On the other hand, RDF is an emergent de facto standard for structured, typed, and distributed data. These two worlds are conceptually quite distinct; RDF data has no inherent interaction with the concept of the document boundary. But it turns out that the document boundary can scope RDF access; the interaction between RDF data and their enclosing documents can help solve problems around structured and unstructured data together in the same database management system.
In this paper I explore a few aspects in which RDF and documents are complementary when used together. First, I will consider the hybridization of query and mixed-content search. Since we can now mix data and text content freely, the lines between search and query blur in favor of a kind of information retrieval based on both relevance and exactitude. Second, I’ll take a look at different kinds of RDF-in-XML documents. Some examples of document-based RDF use cases include a simple (and naive) method for maintaining rule-based inference state machines, and data binding objects to XML within a greater RDF context.
Document databases are mature and provide many capabilities that are missing from native RDF triple stores. We can help people leverage structured data simply by overlaying that structured data on top of an XML document-oriented substrate, and at the same time provide continuity to legacy applications already using documents. Storing RDF in XML document databases opens them up to a wide new range of capabilities, as the global indexing and querying of data in the XML database becomes more interconnected and randomly accessible when indexed as RDF.
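The idea of triples scoped by a document boundary, combined with text search, can be sketched as follows. This is only an illustration under stated assumptions, not the encoding or query mechanism of any particular database: the `article`, `p`, `triple`, `subject`, `predicate` and `object` element names, the document URIs and the `search` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: prose plus embedded triples in the same XML.
documents = {
    "/articles/1.xml": (
        "<article><p>Ada wrote the first program.</p>"
        "<triple><subject>Ada</subject><predicate>wrote</predicate>"
        "<object>program</object></triple></article>"
    ),
    "/articles/2.xml": (
        "<article><p>Grace wrote a compiler.</p>"
        "<triple><subject>Grace</subject><predicate>wrote</predicate>"
        "<object>compiler</object></triple></article>"
    ),
}


def search(docs, word, predicate):
    """Return (uri, triple) pairs from documents whose prose mentions `word`."""
    hits = []
    for uri, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        prose = " ".join(p.text or "" for p in root.iter("p"))
        if word.lower() not in prose.lower():
            continue  # text search decides which documents are in scope
        for t in root.iter("triple"):
            triple = (
                t.findtext("subject"),
                t.findtext("predicate"),
                t.findtext("object"),
            )
            if triple[1] == predicate:  # exact match on the structured data
                hits.append((uri, triple))
    return hits


hits = search(documents, "compiler", "wrote")
```

The text match narrows the set of documents and the predicate match selects triples inside them, a small-scale version of mixing search with query.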
Charles Greer is a software engineer at MarkLogic Corporation, currently working to promote and develop semantics solutions on top of an XML document database architecture. His background includes stints at enterprise architecture, IT management, geospatial database administration, and Slavic linguistics. The first XML spec captured his imagination and he’s had markup on the brain ever since. Oh, and accordions.
Scientific Computing in the Open Web Platform
Alex Milowski (University of Edinburgh) and Henry Thompson (University of Edinburgh)
Publishing and using scientific data on the Web is difficult; size and data formats thwart its use within the browser. Yet, the Open Web Platform provides a basis for many forms of computing and communication and so we look to the principles of Web Architecture to help enable scientific data on the Web. Through a combination of these principles and the use of RDFa annotation technologies, we describe a methodology for publishing data and show how it can be computed upon within the Web browser as a platform for scientific computing.
RADL: RESTful API Description Language
Jonathan Robie (EMC), Rémon Sinnema (EMC) and Erik Wilde (EMC)
In a REST API, the server provides options to a client in the form of hypermedia links in documents, and the main thing a client needs to know is how to locate and use these links in order to use the API. The main job of a REST API description is to provide this information to the client in the context of media type descriptions. Unfortunately, most REST service description languages and design methodologies focus on other concerns instead.
RESTful API Description Language (RADL) is an XML vocabulary for describing Hypermedia-driven RESTful APIs. The APIs it describes may use any media type, in XML, JSON, HTML, or any other format. The structure of a RADL description is based on media types, including the documents associated with a media type, links found in these documents, and the interfaces associated with these links.
RADL can be used as a specification language or as run-time metadata to describe a service.
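The client-side task described above, locating a link in a received document and using it, might look like the following sketch. It does not show RADL syntax, which this abstract does not reproduce. The JSON layout with `links`, `rel` and `href` members is a hypothetical media type invented for the example.

```python
import json


def find_link(document, rel):
    """Return the href of the first link with the given relation, or None."""
    for link in document.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# Hypothetical response body; a real client would receive it over HTTP.
response_body = json.loads(
    '{"name": "order 42", "links": ['
    '{"rel": "self", "href": "/orders/42"},'
    '{"rel": "payment", "href": "/orders/42/payment"}]}'
)
next_url = find_link(response_body, "payment")
```

The client only needs to know the link relation it is looking for and where links live in the media type. The URL itself is supplied by the server at run time.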
XML Authoring On Mobile Devices
George Bina (Syncro Soft)
Not too long ago XML-born content was not present in a mobile-friendly form on mobile devices. Now, many of the XML frameworks like DocBook, DITA and TEI provide output formats that are tuned to be used on mobile devices. These are either different electronic book formats (EPUB, Kindle) or different mobile-friendly web formats.
Many people find XML authoring difficult on computers, let alone mobile devices. However, the constantly increasing number of mobile devices, which has led people to create mobile-friendly output formats from XML documents, clearly also creates a need for direct access to authoring XML content on these devices.
I would like to explore the options for providing XML authoring on mobile devices and describe our current work and the technology choices we made to create an authoring solution for mobile devices. Trying to enable people to create XML documents on mobile devices is very exciting, mainly because the user interaction is completely different on a mobile device: different screen resolutions, different interaction methods (touch, swipe, pinch), etc. See how we imagined XML authoring on an Android phone or on an iPad! How about editing XML on a smart TV? Leverage speech recognition/dictation and handwriting recognition technologies that are available on mobile devices to enable completely new ways of interacting with XML documents!
George Bina is one of the founders of Syncro Soft, the company that develops oXygen XML Editor. He has more than 15 years of experience in working with XML and related technologies including XML-related projects, oXygen XML Editor and participation in open source projects, the most notable being DITA-NG – a Relax NG implementation of DITA – and oNVDL – an open source implementation of the NVDL standard, a project that is now merged into Jing.
A MathML Progress Report
Autumn Cuellar (Design Science)
In the early days of HTML, math was a heavy topic of conversation within the HTML Working Group. The World Wide Web, after all, was built by scientists for scientists, and math resides at the heart of science. Displaying math on the Web was a tricky problem, however. Math is not an image and should not be treated as an image. Math is text and should be an inherent part of the document along with the paragraph text in the document, but the special formatting required was beyond the capabilities of browsers at the time. The problem was more than the HTML WG was equipped to handle. Thus, the Math Working Group was formed to tackle the challenge of a math markup language not only for display of equations on the Web but for a standard format for mathematics to be used within any mathematical and scientific communication. The MathML 1.0 specification became a W3C Recommendation in 1998. This paper will discuss the progress of MathML since.
The MathML language has undergone two major revisions since the initial MathML 1.0 specification. The latest revision, MathML 3.0, was finalized in October 2010. For the latest version, the Math Working Group carefully considered the needs of various groups with a stake in math communication. For example, support for better control of automatic linebreaking/line wrapping was added for the publishing community, who wanted rendering engines to be able to automatically break an equation extending beyond a set column or page width. MathML 3.0 also includes improved features for specifying elementary math notation and new support for international math. Though no standard is ever really complete, MathML has reached maturity with the latest specification.
Equations are rarely standalone objects. MathML is most useful when used in conjunction with a doctype that is larger in scope, and lately the standard has been gaining steam as a worthwhile format for encoding mathematics within wider standards. On the data side, scientific markup languages such as CellML and Systems Biology Markup Language (SBML) rely on MathML to contain the mathematics of the stored models. On the document side, MathML has been adopted by a range of XML standards, from DAISY and NIMAS on the accessibility front, to the Journal Article Tag Set (JATS) for scientific journal articles, to DITA, which is used primarily for technical documentation. But perhaps the most significant milestone for MathML has been its recent inclusion in the HTML5 and EPUB 3 standards.
Now that MathML is nearly ubiquitous as a standard, what about tool support? Support for the MathML standards can be found in a range of applications, including authoring systems, computer algebra and other scientific computation systems, and reading systems. Nevertheless, a couple of challenges remain in this area. One is that most people want to see equations in their browsers and ebook systems, but support for MathML is lagging in both browsers and EPUB e-readers. One reason for this is that the makers of these systems can now depend on MathJax, an open-source JavaScript library for rendering MathML in browsers. MathJax is a useful short-term solution, but it is insufficient for a number of reasons. The other remaining challenge is that conversion of documents in legacy formats can be difficult.
MathML has come a long way since its early days. The language has been steadily evolving over the past 15 years and has reached a healthy maturity in its latest version. The wider standards communities have come to recognize the value that MathML adds as a means of communicating mathematical and scientific information and have responded by including MathML where needed. The next step in the evolution of MathML is the continued development of tool support, especially native rendering in browsers and ereading systems and conversion tools for legacy formats.
Finalising a (small) Standard
John Lumley (JWL Research Ltd)
This paper discusses issues and lessons that arose during the finalisation of a standard (library) for XSLT/XPath/XQuery extension functions to manipulate binary data. This process took place during 2013 in the EXPath community, through shared (mailing-list) commenting, specification redrafting, implementation experimentation and test suite development. The purpose, form and specification of the library (which isn’t technically difficult) are described briefly. Lessons and suggestions arising from the development are presented in four broad categories: establishing policies, concurrent implementation and application, using tools and declarative approaches, and pragmatic issues. None of these lessons is new, but all bear reinforcement.
Publishing in Style with XML
Liam R E Quin (W3C)
This paper reviews the status of CSS for producing books, both in print and on screen, discusses W3C strategy and CSS Working Group practice for moving CSS forward, and indicates some major areas of CSS strengths and weaknesses compared to XSL-FO.
Liam Quin has been working with digital typography, descriptive markup and electronic representation of texts and books since the early 1980s, has been with the W3C since 2001, styles himself Mrs XML, and believes shoes damage the mind.
Formatting from XML
Tony Graham (Mentea)
Formatting from XML is in freefall. On one hand, XSL-FO standardisation quietly died even as XSL-FO usage is on the increase, while on the other hand, CSS is moving to standardise properties for paginated media yet its pagination spec has been forked and Liam Quin, XML Activity Lead at W3C, says “I hope that CSS catches up with XSL-FO over the next two or three years.” Recently, the W3C also created a Digital Publishing Interest Group that, starting from a wide-ranging mission as “a forum for experts in the digital publishing ecosystem of electronic journals, magazines, news, or book publishing (authors, creators, publishers, news organizations, booksellers, accessibility and internationalization specialists, etc.)… to align the existing formats and technologies (e.g., for electronic books) with those used by the Open Web Platform”, has narrowed its formatting focus to books and is “currently focused on getting more publishing companies to join”. So if you are not a publishing company, you are not looking to represent your data in HTML5, or you need more than CSS or XSL-FO can currently provide, then you are out of luck, or out of standards-based solutions, for the immediate future.
Tony Graham has been working with markup since 1991, with XML since 1996, and with XSLT/XSL-FO since 1998. He is Chair of the Print and Page Layout Community Group at the W3C and previously an invited expert on the W3C XML Print and Page Layout Working Group (XPPL) defining the XSL-FO specification, as well as an acknowledged expert in XSLT, developer of the open source xmlroff XSL formatter, a committer to both the XSpec and Juxy XSLT testing frameworks, the author of “Unicode: A Primer”, a member of the XML Guild, and a qualified trainer.
Tony’s career in XML and SGML spans Japan, USA, UK, and Ireland, working with data in English, Chinese, Japanese, and Korean, and with academic, automotive, publishing, software, and telecommunications applications. He has also spoken about XML, XSLT, XSL-FO, EPUB, and related technologies to clients and conferences in North America, Europe, and Australia.
ProXist – XProc Processes in eXist
Ari Nordström (Condesign AB)
ProX is an abstraction layer around XProc pipelines, an XML-based blueprint that lists any processes built around XProc pipelines in a system, including the pipelines themselves, any input they might accept (XSLT, schemas, etc), as well as any configuration options the pipelines accept when run. When narrowed down to an instance, ProX describes a specific process with a specific pipeline and a specific configuration. For example, a generic print publishing process might allow choosing between several related pipelines that can use different XSLTs configured with different input parameters and other options, which need to be narrowed down to an instance with all of the choices made.
The ProX instance XML can then be used to generate a script that runs the selected process, configuring it and the resources used with any runtime values.
ProXist is a ProX implementation for eXist, run using a wrapper XQuery and accompanying pipelines. The wrapper preprocesses its input and presents the ProX blueprint to a user in an XForm, allowing the user to make choices and narrow the ProX blueprint down to an instance that, when saved, is used to generate an XQuery that runs the child pipeline represented by the selected process and configuration.
Ari Nordström is the resident XML guy at Condesign AB in Göteborg, Sweden. His information structures and solutions are used by Volvo Cars, Ericsson, and many others. His favourite XML specification remains XLink so quite a few of his frequent talks and presentations on XML focus on linking and various aspects of reuse.
Ari spends some of his spare time projecting films at the Draken Cinema in Göteborg, which should explain why he wants to automate cinemas using XML. He now realises it’s too late, however.