Syntax Representations of the DDI-CDI Model

Includes published and exploratory approaches

Warning

Draft, needs revision


Compiled by:

Joachim Wackerow and Deirdre Lungley

Overview

  • Approaches that use the DDI-CDI UML model (Canonical XMI) as a basis

  • Approaches that use the DDI-CDI XML Schema as a basis. The schema itself is generated from the UML model. This is often called XML Schema based data binding; the generators usually create program libraries.

Overview of Generated Encodings

UML Model as Basis

  • XML Schema

  • Ontology (Turtle)

  • JSON-LD

XML Schema as Basis

  • C#

  • C++

  • Java

  • JSON Schema (issues)

  • Python

  • R

  • TypeScript

DDI-CDI

  • DDI-CDI is defined as a UML model

  • A restricted set of UML is used, UML Class Model Interoperable Subset (UCMIS)

  • The model is expressed as Canonical XMI

UCMIS

  • UML Class Model Interoperable Subset (UCMIS), a subset of UML class diagram items, is intended for data modeling.

  • It focuses on core items that are familiar from object-oriented programming.

  • The subset covers items that describe classes, their attributes, and their relationships to each other.

  • UCMIS supports model interoperability, especially if it is in the form of Canonical XMI.
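
To make the scope of such a subset more concrete, the following minimal sketch models classes, attributes, generalization and associations as Python dataclasses. It is an illustration only and not part of UCMIS or DDI-CDI: the type names (`Attribute`, `UmlClass`, `Association`), the helper `all_attributes`, and the toy model are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Attribute:
    name: str
    type_name: str
    lower: int = 0               # minimum cardinality
    upper: Optional[int] = 1     # maximum cardinality, None means unbounded


@dataclass
class UmlClass:
    name: str
    superclass: Optional[str] = None   # single generalization, by class name
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Association:
    name: str
    source: str   # name of the source class
    target: str   # name of the target class


def all_attributes(class_name: str, classes: Dict[str, UmlClass]) -> List[str]:
    """Collect the attribute names of a class, inherited ones first."""
    chain = []
    current = classes.get(class_name)
    while current is not None:
        chain.append(current)
        current = classes.get(current.superclass) if current.superclass else None
    return [a.name for cls in reversed(chain) for a in cls.attributes]


# Hypothetical toy model; class and attribute names are illustrative only.
classes = {
    "Concept": UmlClass("Concept", attributes=[Attribute("name", "String")]),
    "Category": UmlClass(
        "Category",
        superclass="Concept",
        attributes=[Attribute("descriptiveText", "String")],
    ),
}
uses = Association("uses", source="Code", target="Category")

print(all_attributes("Category", classes))
```

Everything in this sketch maps directly onto constructs known from object-oriented programming, which is the point of restricting the model to such a subset.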

Canonical XMI

  • The XML Metadata Interchange (XMI) is an Object Management Group (OMG) standard for exchanging metadata information via Extensible Markup Language (XML).

  • Canonical XMI (also OMG) is a special restricted format of XMI that minimizes variability and provides predictable identification and ordering. It supports interoperability.
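
As a rough illustration of why a predictable XMI layout helps tooling, the following sketch lists the UML classes of an XMI document with the Python standard library only. The embedded fragment is hand-written for this example and is not copied from the DDI-CDI model. The namespace URIs, and the assumption that a name may appear either as a `name` attribute or as a `name` child element, should be checked against the actual file.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT taken from the DDI-CDI XMI file (assumption).
SAMPLE_XMI = """<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
         xmlns:uml="http://www.omg.org/spec/UML/20161101">
  <uml:Model xmi:id="Demo" xmi:type="uml:Model">
    <name>Demo</name>
    <packagedElement xmi:id="Demo-Activity" xmi:type="uml:Class">
      <name>Activity</name>
    </packagedElement>
    <packagedElement xmi:id="Demo-Agent" xmi:type="uml:Class">
      <name>Agent</name>
    </packagedElement>
    <packagedElement xmi:id="Demo-Kind" xmi:type="uml:Enumeration">
      <name>Kind</name>
    </packagedElement>
  </uml:Model>
</xmi:XMI>
"""


def local_name(qualified):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return qualified.rsplit("}", 1)[-1]


def uml_class_names(xmi_text):
    """Return the names of all elements typed as uml:Class, in document order."""
    names = []
    for element in ET.fromstring(xmi_text).iter():
        # Find the xmi:type attribute without hard-coding the XMI namespace URI.
        xmi_type = next(
            (v for k, v in element.attrib.items() if local_name(k) == "type"),
            None,
        )
        if xmi_type != "uml:Class":
            continue
        # Accept the name as an attribute or as a child element.
        name = element.get("name")
        if name is None:
            child = next((c for c in element if local_name(c.tag) == "name"), None)
            name = child.text if child is not None else None
        if name:
            names.append(name)
    return names


print(uml_class_names(SAMPLE_XMI))
```

Because identification and ordering are fixed in Canonical XMI, a simple traversal like this would be expected to give the same result regardless of the UML tool that exported the file.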

XML Schema

The XML Schema is directly based on the UML model. It is generated with the UCMIS Model-to-Text tool.

  • The entire XML Schema is available in the GitHub repository: ddi-cdi.xsd

  • Fragments per class and data type are available in the field-level documentation of DDI-CDI. See the section ‘encodings’ at the bottom of the example ‘InstanceVariable’.
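
For a first look at a schema file, the globally declared elements and complex types can be listed with the Python standard library. This is a minimal sketch: the embedded schema is a hand-written miniature, and the names `Activity` and `ActivityXsdType` are placeholders rather than actual declarations from ddi-cdi.xsd.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C XML Schema language, in ElementTree's '{uri}' notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hand-written miniature schema for illustration only (assumption).
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Activity" type="ActivityXsdType"/>
  <xs:complexType name="ActivityXsdType">
    <xs:sequence>
      <xs:element name="name" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def global_components(xsd_text):
    """Return the names of top-level elements and complex types of a schema."""
    root = ET.fromstring(xsd_text)
    # findall() without a path prefix only looks at direct children of the
    # root, so locally declared elements such as 'name' are not reported.
    return {
        "elements": [e.get("name") for e in root.findall(XS + "element")],
        "complexTypes": [t.get("name") for t in root.findall(XS + "complexType")],
    }


print(global_components(SAMPLE_XSD))
```

For the real schema, the text could be read from the file first, for example with `pathlib.Path("ddi-cdi.xsd").read_text(encoding="utf-8")`.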

RDF

The RDF encodings, the ontology as Turtle and JSON-LD, are directly based on the UML model. They are generated with the UCMIS Model-to-Text tool.

Ontology as Turtle

  • Main file ddi-cdi.onto.ttl

  • Fragments per class and data type are available in the field-level documentation of DDI-CDI. See the section ‘encodings’ at the bottom of the example InstanceVariable.

Generator Code for Ontology as Turtle

Acceleo code for generating Turtle on the basis of UML classes.

classOnto.mtl
[comment encoding = UTF-8 /]
[module classOnto('http://www.eclipse.org/uml2/5.0.0/Types', 'http://www.eclipse.org/uml2/5.0.0/UML', 'http://www.eclipse.org/uml2/5.0.0/UML/Profile/Standard')]

[import ucmis::m2t::query::modelQuery /]
[import ucmis::m2t::target::rdf::commonRdf /]
[import ucmis::m2t::target::onto::associationOnto /]
[import ucmis::m2t::target::onto::commonOnto /]

[template public classOnto(aClass : Class)]
# class [aClass.name/]
# based on the UML class [aClass.qualifiedName/]
[aClass.iri()/]
  a rdfs:Class, owl:Class, ucmis:Class;
  rdfs:label "[aClass.name/]";
  [rdfs_comment(aClass.ownedComment)/]
  [aClass.superClasses()/]
.

[for ( anAttribute : Property | aClass.e_attributes() ) ]
[anAttribute.attributeOnto()/]
[/for]

[for ( anAssociation : Association | aClass.eInverse(Association) ) ?(anAssociation.source() = aClass) ]
[anAssociation.associationOnto(aClass)/]
[/for]
[/template]
Source files are available in the UCMIS M2T repository.
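
For readers who do not know Acceleo, the following Python sketch renders roughly the text that the head of the template produces for one class. It is not the UCMIS tool and not a port of it. The output of the helper templates `rdfs_comment` and `superClasses` is not shown in the listing, so the `rdfs:comment` and `rdfs:subClassOf` lines are assumptions, as is the `cdi:` prefix in the usage example.

```python
def class_to_turtle(name, qualified_name, iri, comment=None, super_iris=()):
    """Render one class description in Turtle, loosely following classOnto.mtl."""
    lines = [
        f"# class {name}",
        f"# based on the UML class {qualified_name}",
        iri,
        "  a rdfs:Class, owl:Class, ucmis:Class;",
        f'  rdfs:label "{name}";',
    ]
    if comment:
        # Escape characters that may not appear raw in a short Turtle string.
        escaped = (
            comment.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'  rdfs:comment "{escaped}";')  # assumed helper output
    for super_iri in super_iris:
        lines.append(f"  rdfs:subClassOf {super_iri};")  # assumed helper output
    lines.append(".")
    return "\n".join(lines)


print(
    class_to_turtle(
        "Agent",
        "Demo::Agent",          # hypothetical qualified name
        "cdi:Agent",            # hypothetical prefixed IRI
        comment='An "actor" in a process.',
        super_iris=["cdi:Thing"],
    )
)
```

The attribute and association loops of the template would append further statements in the same way, one block per property.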

JSON-LD

  • Main file ddi-cdi.jsonld

  • Fragments per class and data type are available in the field-level documentation of DDI-CDI. See the section ‘encodings’ at the bottom of the example InstanceVariable.

C#

Based on the XML Schema.

The Microsoft XML Schema Definition (Xsd.exe) tool generates XML schema or common language runtime classes from XDR, XML, and XSD files, or from classes in a runtime assembly.

Windows Batch File

run_xsd.cmd
set MICROSOFT_XSD=C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\xsd
set XSD_HOME=..\..\model_based\xsd_variants

"%MICROSOFT_XSD%" "%XSD_HOME%\ddi-cdi_42_noXsdTypeInName.xsd" /classes /language:CS
"%MICROSOFT_XSD%" "%XSD_HOME%\ddi-cdi_43_noXsdTypeInName.xsd" /classes /language:CS
"%MICROSOFT_XSD%" "%XSD_HOME%\ddi-cdi_44_noXsdTypeInName.xsd" /classes /language:CS
"%MICROSOFT_XSD%" "%XSD_HOME%\ddi-cdi_45_noXsdTypeInName.xsd" /classes /language:CS

set MICROSOFT_XSD=
set XSD_HOME=

Generated File

A code file is generated for C#.

Fragment of ddi-cdi_44_noXsdTypeInName.cs
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "4.8.3928.0")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://ddialliance.org/Specification/DDI-CDI/1.0/XMLSchema/")]
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://ddialliance.org/Specification/DDI-CDI/1.0/XMLSchema/", IsNullable=false)]
public partial class DDICDIModels {
    
    private object[] itemsField;
    
    private Wrapper[] wrapperField;
    
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("Activity", typeof(Activity))]
    [System.Xml.Serialization.XmlElementAttribute("Agent", typeof(Agent))]
    [System.Xml.Serialization.XmlElementAttribute("AgentListing", typeof(AgentListing))]
    [System.Xml.Serialization.XmlElementAttribute("AgentPosition", typeof(AgentPosition))]
    [System.Xml.Serialization.XmlElementAttribute("AgentRelationship", typeof(AgentRelationship))]
    [System.Xml.Serialization.XmlElementAttribute("AgentStructure", typeof(AgentStructure))]
    [System.Xml.Serialization.XmlElementAttribute("AllenIntervalAlgebra", typeof(AllenIntervalAlgebra))]
    [System.Xml.Serialization.XmlElementAttribute("AttributeComponent", typeof(AttributeComponent))]
    [System.Xml.Serialization.XmlElementAttribute("AuthorizationSource", typeof(AuthorizationSource))]
    [System.Xml.Serialization.XmlElementAttribute("Category", typeof(Category))]
    [System.Xml.Serialization.XmlElementAttribute("CategoryPosition", typeof(CategoryPosition))]
    [System.Xml.Serialization.XmlElementAttribute("CategoryRelationStructure", typeof(CategoryRelationStructure))]
    [System.Xml.Serialization.XmlElementAttribute("CategoryRelationship", typeof(CategoryRelationship))]
    [System.Xml.Serialization.XmlElementAttribute("CategorySet", typeof(CategorySet))]
    [System.Xml.Serialization.XmlElementAttribute("CategoryStatistic", typeof(CategoryStatistic))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationFamily", typeof(ClassificationFamily))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationIndex", typeof(ClassificationIndex))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationIndexEntry", typeof(ClassificationIndexEntry))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationIndexEntryPosition", typeof(ClassificationIndexEntryPosition))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationItem", typeof(ClassificationItem))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationItemPosition", typeof(ClassificationItemPosition))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationItemRelationship", typeof(ClassificationItemRelationship))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationItemStructure", typeof(ClassificationItemStructure))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationPosition", typeof(ClassificationPosition))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationSeries", typeof(ClassificationSeries))]
    [System.Xml.Serialization.XmlElementAttribute("ClassificationSeriesStructure", typeof(ClassificationSeriesStructure))]
    [System.Xml.Serialization.XmlElementAttribute("Code", typeof(Code))]
    [System.Xml.Serialization.XmlElementAttribute("CodeList", typeof(CodeList))]
    [System.Xml.Serialization.XmlElementAttribute("CodeListStructure", typeof(CodeListStructure))]
    [System.Xml.Serialization.XmlElementAttribute("CodePosition", typeof(CodePosition))]
    [System.Xml.Serialization.XmlElementAttribute("CodeRelationship", typeof(CodeRelationship))]
    [System.Xml.Serialization.XmlElementAttribute("ComponentPosition", typeof(ComponentPosition))]
    [System.Xml.Serialization.XmlElementAttribute("Concept", typeof(Concept))]

Source file ddi-cdi.cs

C++

Based on the XML Schema.

CodeSynthesis XSD is an open-source, cross-platform W3C XML Schema to C++ data binding compiler.

Windows Batch File

run_xsd.cmd
set XSD_HOME=..\..\model_based\xsd_variants
set XSD_EXE_FOLDER=C:\Programs\c++xsd\xsd-4.2.0-x86_64-windows10\bin
rem set XSD_EXE_FOLDER=E:\DDI\CDI\SyntaxRepresentation\encoding\c++\xsd-4.2.0-x86_64-windows10\bin

%XSD_EXE_FOLDER%\xsd.exe cxx-tree --root-element DDICDIModels %XSD_HOME%\ddi-cdi_42_noXsdTypeInName.xsd
rem %XSD_EXE_FOLDER%\xsd.exe cxx-tree --root-element DDICDIModels %XSD_HOME%\ddi-cdi_43_noXsdTypeInName.xsd
rem %XSD_EXE_FOLDER%\xsd.exe cxx-tree --root-element DDICDIModels %XSD_HOME%\ddi-cdi_44_noXsdTypeInName.xsd
rem %XSD_EXE_FOLDER%\xsd.exe cxx-tree --root-element DDICDIModels %XSD_HOME%\ddi-cdi_45_noXsdTypeInName.xsd

set XSD_HOME=
set XSD_EXE_FOLDER=

Source file run_xsd.cmd

Generated Files

A code file and a header file are generated for C++.

C++ Code File Fragment

Fragment of ddi-cdi_44_noXsdTypeInName.cxx
#include <xsd/cxx/pre.hxx>

#include "ddi-cdi_44_noXsdTypeInName.hxx"

namespace XMLSchema
{
  // DDICDIModels
  //

  const DDICDIModels::Activity_sequence& DDICDIModels::
  Activity () const
  {
    return this->Activity_;
  }

  DDICDIModels::Activity_sequence& DDICDIModels::
  Activity ()
  {
    return this->Activity_;
  }

  void DDICDIModels::
  Activity (const Activity_sequence& s)
  {
    this->Activity_ = s;
  }

Source file ddi-cdi.cxx

C++ Header File Fragment

Fragment of ddi-cdi_44_noXsdTypeInName.hxx
    // AuthorizationSource
    //
    typedef ::XMLSchema::AuthorizationSource AuthorizationSource_type;
    typedef ::xsd::cxx::tree::sequence< AuthorizationSource_type > AuthorizationSource_sequence;
    typedef AuthorizationSource_sequence::iterator AuthorizationSource_iterator;
    typedef AuthorizationSource_sequence::const_iterator AuthorizationSource_const_iterator;
    typedef ::xsd::cxx::tree::traits< AuthorizationSource_type, char > AuthorizationSource_traits;

    const AuthorizationSource_sequence&
    AuthorizationSource () const;

    AuthorizationSource_sequence&
    AuthorizationSource ();

    void
    AuthorizationSource (const AuthorizationSource_sequence& s);

    // Category
    //
    typedef ::XMLSchema::Category Category_type;
    typedef ::xsd::cxx::tree::sequence< Category_type > Category_sequence;

Source file ddi-cdi.hxx

Java

Both approaches, JAXB and XMLBeans, are based on the XML Schema.

JAXB Generation - Windows Batch File

run_xjc.cmd
set JAXB_HOME=C:\Programs\jaxb\jaxb-ri-4.0.3\jaxb-ri
set XSD_HOME=..\..\model_based\xsd_variants

IF EXIST jaxb_42 RMDIR /S /Q jaxb_42
IF NOT EXIST jaxb_42 MKDIR jaxb_42
call %JAXB_HOME%\bin\xjc.bat -d jaxb_42 "%XSD_HOME%\ddi-cdi_42_noXsdTypeInName.xsd"

IF EXIST jaxb_43 RMDIR /S /Q jaxb_43
IF NOT EXIST jaxb_43 MKDIR jaxb_43
call %JAXB_HOME%\bin\xjc.bat -d jaxb_43 "%XSD_HOME%\ddi-cdi_43_noXsdTypeInName.xsd"

IF EXIST jaxb_44 RMDIR /S /Q jaxb_44
IF NOT EXIST jaxb_44 MKDIR jaxb_44
call %JAXB_HOME%\bin\xjc.bat -d jaxb_44 "%XSD_HOME%\ddi-cdi_44_noXsdTypeInName.xsd"

IF EXIST jaxb_45 RMDIR /S /Q jaxb_45
IF NOT EXIST jaxb_45 MKDIR jaxb_45
call %JAXB_HOME%\bin\xjc.bat -d jaxb_45 "%XSD_HOME%\ddi-cdi_45_noXsdTypeInName.xsd"

set JAXB_HOME=
set XSD_HOME=
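
The batch file repeats the same three steps for each schema variant. As a sketch of how the repetition could be expressed once, the following Python code plans the xjc calls in a loop and resets an output folder. The function names `reset_dir` and `xjc_commands` are made up for this example; only the `-d` option, the folder names and the file name pattern are taken from the batch file above.

```python
import shutil
from pathlib import Path

# Schema variant suffixes used in the batch files.
VARIANTS = ("42", "43", "44", "45")


def reset_dir(path):
    """Delete `path` if it exists, then create it empty (RMDIR /S /Q + MKDIR)."""
    path = Path(path)
    if path.exists():
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


def xjc_commands(xjc, xsd_home, variants=VARIANTS):
    """Return one (output_dir, argv) pair per schema variant."""
    runs = []
    for variant in variants:
        out_dir = f"jaxb_{variant}"
        xsd = Path(xsd_home) / f"ddi-cdi_{variant}_noXsdTypeInName.xsd"
        runs.append((out_dir, [str(xjc), "-d", out_dir, str(xsd)]))
    return runs


# Print the planned calls; nothing is executed here.
for out_dir, argv in xjc_commands("xjc.bat", "../../model_based/xsd_variants"):
    print(out_dir, argv)
```

To run the generation, each output folder would be prepared with `reset_dir(out_dir)` and the argument list handed to `subprocess.run(argv, check=True)`. Whether the batch wrapper of xjc can be started this way on a given Windows setup has not been tested here.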

Generation with XMLBeans - Ant Build File

build.xml
<?xml version="1.0" encoding="UTF-8"?>
<project name="ddi-cdi-xmlbeans" default="create-xmlbeans">
	<!-- xmlbeans location -->
	<property name="xmlbeans.lib.dir" location="C:/Programs/xmlbeans-5.2.0/lib"/>
	<!-- Set the classpath -->
	<path id="build.classpath">
		<fileset dir="${xmlbeans.lib.dir}">
			<include name="*.jar"/>
		</fileset>
	</path>
	<taskdef name="xmlbeans" classname="org.apache.xmlbeans.impl.tool.XMLBean" classpath="${xmlbeans.lib.dir}/xmlbeans-5.2.0.jar" classpathref="build.classpath"/>
	<!-- ddi-cdi schema -->
	<property name="xsd_42" location="../../model_based/xsd_variants/ddi-cdi_42_noXsdTypeInName.xsd"/>
	<property name="xsd_43" location="../../model_based/xsd_variants/ddi-cdi_43_noXsdTypeInName.xsd"/>
	<property name="xsd_44" location="../../model_based/xsd_variants/ddi-cdi_44_noXsdTypeInName.xsd"/>
	<property name="xsd_45" location="../../model_based/xsd_variants/ddi-cdi_45_noXsdTypeInName.xsd"/>
	<target name="create-xmlbeans">
		<xmlbeans schema="${xsd_42}" download="true" classpath="ddi-cdi-xmlbeans.jar" srconly="true" srcgendir="xmlbeans_42"/>
		<xmlbeans schema="${xsd_43}" download="true" classpath="ddi-cdi-xmlbeans.jar" srconly="true" srcgendir="xmlbeans_43"/>
		<xmlbeans schema="${xsd_44}" download="true" classpath="ddi-cdi-xmlbeans.jar" srconly="true" srcgendir="xmlbeans_44"/>
		<xmlbeans schema="${xsd_45}" download="true" classpath="ddi-cdi-xmlbeans.jar" srconly="true" srcgendir="xmlbeans_45"/>
	</target>
</project>

Evaluation

JAXB seems to be the better choice. XMLBeans doesn’t … (to be filled in)

JSON Schema

The approach is based on the XML Schema:
xsd2jsonschema, a pure JavaScript library for translating complex XML Schemas into JSON Schemas.

Other known approaches are provided by the commercial products Altova XMLSpy and Oxygen XML Editor.

JSON Schema Generation with xsd2jsonschema - Windows Batch File

run.cmd
node ddi-cdi.js > ddi-cdi.schema.json

xs:choice Doesn’t Seem to be Implemented

generation.log
Error: choice array needs to be implemented!!
    at ConverterDraft07.handleChoiceArray (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\converterDraft04.js:328:9)
    at ConverterDraft07.choice (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\converterDraft04.js:357:16)
    at ConverterDraft07.process (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\converterDraft04.js:134:33)
    at BaseConversionVisitor.visit (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\visitors\visitor.js:60:26)
    at DepthFirstTraversal.walk (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\depthFirstTraversal.js:47:34)
    at DepthFirstTraversal.walk (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\depthFirstTraversal.js:52:12)
    at DepthFirstTraversal.walk (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\depthFirstTraversal.js:52:12)
    at DepthFirstTraversal.walk (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\depthFirstTraversal.js:52:12)
    at DepthFirstTraversal.traverse (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\depthFirstTraversal.js:78:9)
    at Xsd2JsonSchema.processSchema (E:\Git\ddi-cdi_encoding\encoding\xsd_based\json-schema\node_modules\xsd2jsonschema\src\xsd2JsonSchema.js:264:37)
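
The message "choice array" appears to refer to an xs:choice that may repeat, although this has not been verified in the xsd2jsonschema source. As a sketch of one possible JSON Schema shape for such a construct, a repeating choice can be written as an array whose items are objects that carry exactly one of the permitted properties. This is neither what xsd2jsonschema produces nor a DDI-CDI design decision. The function name and the `#/definitions/...` references are assumptions for this example.

```python
import json


def choice_array_schema(element_names):
    """Build a JSON Schema fragment for a repeating xs:choice (one possible mapping)."""
    alternatives = [
        {
            "type": "object",
            # Each alternative allows exactly one property, named after the element.
            "properties": {name: {"$ref": f"#/definitions/{name}"}},
            "required": [name],
            "additionalProperties": False,
        }
        for name in element_names
    ]
    # 'oneOf' makes every array item match exactly one of the alternatives.
    return {"type": "array", "items": {"oneOf": alternatives}}


schema = choice_array_schema(["Activity", "Agent"])
print(json.dumps(schema, indent=2))
```

An instance such as `[{"Agent": {...}}, {"Activity": {...}}]` would keep both the element names and the document order, which a plain object with one property per element name would lose.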

Python

Approach is based on the XML Schema.

generateDS - Generate Data Structures from XML Schema

Installation
pip install generateDS

Windows Batch File for the Generation

xsd2py.cmd
set XSD_HOME=..\..\model_based\xsd_variants

process_includes.py %XSD_HOME%\ddi-cdi_42_noXsdTypeInName.xsd ddi-cdi_42_noXsdTypeInName_complete.xsd
generateDS.py --export=etree --no-warnings -o ddi-cdi_42.py ddi-cdi_42_noXsdTypeInName_complete.xsd

process_includes.py %XSD_HOME%\ddi-cdi_43_noXsdTypeInName.xsd ddi-cdi_43_noXsdTypeInName_complete.xsd
generateDS.py --export=etree --no-warnings -o ddi-cdi_43.py ddi-cdi_43_noXsdTypeInName_complete.xsd

process_includes.py %XSD_HOME%\ddi-cdi_44_noXsdTypeInName.xsd ddi-cdi_44_noXsdTypeInName_complete.xsd
generateDS.py --export=etree --no-warnings -o ddi-cdi_44.py ddi-cdi_44_noXsdTypeInName_complete.xsd

process_includes.py %XSD_HOME%\ddi-cdi_45_noXsdTypeInName.xsd ddi-cdi_45_noXsdTypeInName_complete.xsd
generateDS.py --export=etree --no-warnings -o ddi-cdi_45.py ddi-cdi_45_noXsdTypeInName_complete.xsd

set XSD_HOME=
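
The generated file names, for example ddi-cdi_44.py, contain a hyphen and therefore cannot be loaded with a plain `import` statement. The following sketch loads such a file through `importlib`. The function name `load_module_from_file` is made up for this example, and nothing is assumed here about the content of the generated modules.

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_file(path, module_name=None):
    """Import a Python file whose name is not a valid identifier."""
    path = Path(path)
    # Derive an importable name by replacing the hyphen, e.g. 'ddi_cdi_44'.
    name = module_name or path.stem.replace("-", "_")
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it, as in the importlib recipe,
    # so that code inside the file can look itself up in sys.modules.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module
```

A call such as `cdi = load_module_from_file("ddi-cdi_44.py")` would then make the generated classes available as attributes of `cdi`. Modules produced by generateDS usually offer a module-level `parse` function for reading an XML instance; whether that holds for these files should be checked against the generated code.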

Python

This prototype approach is based on the UML model:

TO DO

R Language

The approach is based on the XML Schema.

R XMLSchema by Duncan Temple Lang.

A package that reads XML schema into an R representation and can perform some operations on the resulting information to generate class definitions and code to read documents using this schema.

The R representation uses S4 classes.

Installation of XMLSchema in R

install.packages("remotes")
remotes::install_github("omegahat/XMLSchema")
install.packages("sloop")

Test Run

An internal representation of the XML schema is created as S4 classes.

options(echo=TRUE)

library(sloop)
library(XML)
library(XMLSchema)

args <- commandArgs(TRUE)
print(args[1])

print('--------------------------------------------------')
cdi = readSchema(args[1], createConverters = TRUE, verbose = TRUE)
# cdi = readSchema("E:/Git/ddi-cdi/build/encoding/xml-schema/ddi-cdi.xsd", createConverters = TRUE)

print('--------------------------------------------------')
is(cdi)

print('--------------------------------------------------')
otype(cdi)

print('--------------------------------------------------')
class(cdi)

print('--------------------------------------------------')
names(cdi)

print('--------------------------------------------------')
sapply(cdi, length)

print('--------------------------------------------------')
sapply(cdi, class)

print('--------------------------------------------------')
sapply(cdi, names)

print('--------------------------------------------------')
showMethods( classes=class(cdi) )

print('--------------------------------------------------')
print(cdi)

print('--------------------------------------------------')
saveRDS(cdi, 'cdi.rds')
rm(cdi)
exists("cdi")
cdi <- readRDS('cdi.rds')
otype(cdi)
class(cdi)
names(cdi)

# serialize S4 classes, not sure if this is right
sink("cdi.R")
sapply(cdi, expandS4)
sink()

Documentation of XMLSchema

In R:

help(package="XMLSchema")

Once a local web server has been started, the documentation is displayed in the web browser.

R Language

This prototype approach is based on the UML model:

TO DO

TypeScript

Approach is based on the XML Schema:
cxsd is a streaming XSD parser and XML parser generator for Node.js and TypeScript.
  • The input XML Schema must be provided by a web server.

  • cxsd generates the files XMLSchema.d.ts and XMLSchema.js.
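
Because the schema has to be fetched over HTTP, a short-lived local server is sufficient. The following sketch serves a folder from a background thread and shuts the server down afterwards, so no fixed waiting time and no leftover server process are involved. The helper name `serve_directory` is an assumption for this example; the classes used are from the Python standard library.

```python
import contextlib
import functools
import http.server
import threading


@contextlib.contextmanager
def serve_directory(directory, port=0):
    """Serve `directory` over HTTP on localhost and yield the base URL."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Port 0 lets the operating system pick a free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_address[1]}"
    finally:
        # Stop the request loop, release the socket, wait for the thread.
        server.shutdown()
        server.server_close()
        thread.join()
```

Inside a `with serve_directory(xsd_home) as base:` block, the generator could be given a URL such as `f"{base}/ddi-cdi_42_noXsdTypeInName.xsd"`. The server is stopped as soon as the block is left.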

TypeScript Generation with cxsd - Windows Batch File

run_cxsd.cmd
set XSD_HOME=..\..\model_based\xsd_variants

rem from https://github.com/charto/cxsd
rem   The first line just sets up NPM to allow calling cxsd without installing it globally. It also works on Windows if you omit the single quotes (').

rem echo { "scripts": { "cxsd": "cxsd" } } > package.json
rem npm install cxsd

start python -m http.server --directory %XSD_HOME%
TIMEOUT /T 10

IF EXIST xmlns RMDIR /S /Q xmlns
IF EXIST 42 RMDIR /S /Q 42
IF NOT EXIST 42 MKDIR 42
call npm run cxsd http://localhost:8000/ddi-cdi_42_noXsdTypeInName.xsd > 42.log 2>&1
move xmlns 42

IF EXIST xmlns RMDIR /S /Q xmlns
IF EXIST 43 RMDIR /S /Q 43
IF NOT EXIST 43 MKDIR 43
call npm run cxsd http://localhost:8000/ddi-cdi_43_noXsdTypeInName.xsd > 43.log 2>&1
move xmlns 43

IF EXIST xmlns RMDIR /S /Q xmlns
IF EXIST 44 RMDIR /S /Q 44
IF NOT EXIST 44 MKDIR 44
call npm run cxsd http://localhost:8000/ddi-cdi_44_noXsdTypeInName.xsd > 44.log 2>&1
move xmlns 44

IF EXIST xmlns RMDIR /S /Q xmlns
IF EXIST 45 RMDIR /S /Q 45
IF NOT EXIST 45 MKDIR 45
call npm run cxsd http://localhost:8000/ddi-cdi_45_noXsdTypeInName.xsd > 45.log 2>&1
move xmlns 45

set XSD_HOME=

Open Issues

  • ShEx / SHACL generation on the basis of the UML model (would be great for validation)

  • Would the following encodings make sense?

    • JSON Schema (issues with XSD-based generator), would be great for validation

    • GraphQL (tests with generateDS and StarUML should be documented)

  • Pending tests with the various approaches

  • Does the definition of a set of common high-level functions (get, set, …) make sense across the various encodings?

  • Review of UML Class Model Interoperable Subset (UCMIS)?