XML Schema Conversion

Revision as of 12:22, 17 March 2016 by Marko Luukkainen

Simantics XML-Schema Conversion version 0.1

Plugins:

  • org.simantics.xml.sax
  • org.simantics.xml.sax.base
  • org.simantics.xml.sax.ui

Summary

Simantics XML-Schema conversion creates:

  • A Simantics ontology as a .pgraph file
  • Java classes for a SAX-based parser

Schema conversion does not support:

  • XML file export
  • Many of the XML schema definitions
  • Group definitions
  • Attribute constraining facets

Notes

This work was done in the PDS Integration project in co-operation with VTT and Fortum. Schema conversion was used to convert the Proteus 3.6.0 XML Schema to a Simantics ontology. Because of the limited scope of that schema, the converter supports only a limited part of the XML Schema definitions.

Ontology definitions based on XML schema concepts

XML Schema conversion creates types and relations based on the XML Schema concepts that are used in the conversion.

Hard-coded ontology definition                Notes
hasAttribute <R L0.HasProperty                Base relation for all element/attribute relations
hasID <R hasAttribute                         Base relation for IDs (attributes with xsd:ID type)
ComplexType <T L0.Entity                      Base type for ComplexTypes
hasComplexType <R L0.IsComposedOf             Base relation for containing elements that inherit the specified ComplexType
AttributeGroup <T L0.Entity                   Base type for AttributeGroups
Element <T L0.Entity                          Base type for Elements
hasElement <R L0.IsComposedOf                 Base relation for containing elements
ElementList <T L0.List                        Base type for lists containing Elements (storing the order of the elements)
hasElementList <R L0.IsComposedOf             Base relation for elements containing element lists. Used for creating Element type specific lists.
hasOriginalElementList <R hasElementList      Relation for elements containing element lists. Stores the order of all the child elements.
hasReference <R L0.IsRelatedTo                Base relation for object references (converted ID references)
hasExternalReference <R L0.IsWeaklyRelatedTo  Relation for references between data imported from different files. Note: external references must be created with post-process functions, since schema conversion itself is not able to resolve references between different imports.

Datatypes

XML Schema uses three kinds of attribute datatypes: Atomic, List, and Union. The current XML Schema conversion supports only Atomic attributes.

XML datatype  Simantics      Notes
Atomic        Supported
List          Not supported
Union         Not supported

Primitive attributes are converted to Layer0 literals. The primitive datatypes and their respective literal types are:

XML datatype  Simantics   Notes
string        L0.String
boolean       L0.Boolean
decimal       L0.Double
float         L0.Float
double        L0.Double
duration
dateTime
time          L0.String
date          L0.String
gYearMonth
gYear
gMonthDay
gDay
gMonth
hexBinary
base64Binary
anyURI        L0.Uri
QName
NOTATION

Other built-in datatypes are converted to Layer0 literal types as well:

XML datatype        Simantics   Notes
normalizedString    L0.String
token               L0.String
language
NMTOKEN             L0.String
Name
NCName
ID                  L0.String   ID attributes use the XML.hasID property relation. An element is expected to have only one attribute of type ID.
IDREF               L0.String
IDREFS
ENTITY
ENTITIES
integer             L0.Integer
nonPositiveInteger  L0.Integer
negativeInteger     L0.Integer
long                L0.Long
int                 L0.Integer
short               L0.Integer
byte                L0.Byte
nonNegativeInteger  L0.Integer
unsignedLong        L0.Long
unsignedShort       L0.Integer
unsignedByte        L0.Byte
positiveInteger     L0.Integer
yearMonthDuration
dayTimeDuration
dateTimeStamp
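
The two datatype tables can be read as a plain lookup from XSD type name to Layer0 literal type. The following is a minimal sketch of such a lookup, not the converter's actual code: the class name XsdToLayer0 and the method literalFor are hypothetical, and the entries are copied from the tables above (datatypes with an empty Simantics column are left out).

```java
import java.util.Map;
import java.util.Optional;

/** Illustrative lookup only; this class is hypothetical and not part of the Simantics plugins. */
final class XsdToLayer0 {
    // Rows copied from the datatype tables. Datatypes whose Simantics
    // column is empty are intentionally absent from the map.
    private static final Map<String, String> LITERALS = Map.ofEntries(
        Map.entry("string", "L0.String"),
        Map.entry("boolean", "L0.Boolean"),
        Map.entry("decimal", "L0.Double"),
        Map.entry("float", "L0.Float"),
        Map.entry("double", "L0.Double"),
        Map.entry("time", "L0.String"),
        Map.entry("date", "L0.String"),
        Map.entry("anyURI", "L0.Uri"),
        Map.entry("normalizedString", "L0.String"),
        Map.entry("token", "L0.String"),
        Map.entry("NMTOKEN", "L0.String"),
        Map.entry("ID", "L0.String"),
        Map.entry("IDREF", "L0.String"),
        Map.entry("integer", "L0.Integer"),
        Map.entry("nonPositiveInteger", "L0.Integer"),
        Map.entry("negativeInteger", "L0.Integer"),
        Map.entry("long", "L0.Long"),
        Map.entry("int", "L0.Integer"),
        Map.entry("short", "L0.Integer"),
        Map.entry("byte", "L0.Byte"),
        Map.entry("nonNegativeInteger", "L0.Integer"),
        Map.entry("unsignedLong", "L0.Long"),
        Map.entry("unsignedShort", "L0.Integer"),
        Map.entry("unsignedByte", "L0.Byte"),
        Map.entry("positiveInteger", "L0.Integer"));

    private XsdToLayer0() {}

    /** Returns the Layer0 literal type for an XSD type name, with or without a prefix such as "xsd:". */
    static Optional<String> literalFor(String xsdType) {
        int colon = xsdType.indexOf(':');
        String localName = colon >= 0 ? xsdType.substring(colon + 1) : xsdType;
        return Optional.ofNullable(LITERALS.get(localName));
    }
}
```

For example, XsdToLayer0.literalFor("xsd:decimal") yields L0.Double, while a datatype without a listed literal type, such as duration, yields an empty Optional.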

XML schema allows defining new attribute types with constraining facets. Constraining facets are not currently supported.

XML constraining facet  Simantics  Notes
length
minLength
maxLength
pattern
enumeration
whiteSpace
maxInclusive
maxExclusive
minInclusive
minExclusive
totalDigits
fractionDigits
assertions
explicitTimeZone

In addition, individual attributes can be converted to a single array with the Attribute Composition rule. The supported array datatypes are:

Conversion configuration  Simantics
doubleArray               L0.DoubleArray
stringArray               L0.StringArray

Structures

Type definitions

SimpleType

XML Schema allows a SimpleType to be used as the type of an Element without child elements, or as an attribute type. When a simpleType is used as an attribute type, it is converted to a functional property relation:

<source lang="xml">
<xsd:simpleType name="LengthUnitsType">
  <xsd:restriction base="xsd:NMTOKEN">
    <xsd:enumeration value="mm"/>
    …
  </xsd:restriction>
</xsd:simpleType>

<xsd:element name="UnitsOfMeasure">
  <xsd:annotation>
    <xsd:documentation>These are from …</xsd:documentation>
  </xsd:annotation>
  <xsd:complexType>
    <xsd:attribute name="Distance" type="LengthUnitsType" default="Millimetre">
    </xsd:attribute>
    …
  </xsd:complexType>
</xsd:element>
</source>

PRO.hasLengthUnitsType <R PRO.XML.hasAttribute : L0.FunctionalRelation
   --> L0.String

PRO.hasUnitsOfMeasure <R PRO.XML.hasElement
PRO.hasUnitsOfMeasureList <R PRO.XML.hasElementList
PRO.UnitsOfMeasure <T PRO.XML.Element
PRO.UnitsOfMeasure.hasDistance <R PRO.XML.hasAttribute: L0.FunctionalRelation
   <R PRO.hasLengthUnitsType

When a simpleType is used as the definition of an Element, the definition is converted to inheritance from the base literal type. In the following example, the Knot element’s xsd:double base is converted to inheritance from L0.Double:

<source lang="xml">
<xsd:element name="Knot" maxOccurs="unbounded">
  <xsd:simpleType>
    <xsd:restriction base="xsd:double">
      <xsd:minInclusive value="0.0"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>
</source>

PRO.ComplexTypes.hasKnots.Knot <R PRO.XML.hasElement
PRO.ComplexTypes.hasKnots.KnotList <R PRO.XML.hasElementList
PRO.ComplexTypes.Knots.Knot <T PRO.XML.Element <T L0.Double

ComplexType

Schema conversion creates a hard-coded ComplexType entity as the base type for ComplexTypes.

PRO.XML.ComplexType <T L0.Entity

ComplexTypes that are defined in the input schema are converted to L0.Entities that inherit the hard-coded ComplexType, and they are put into the “ComplexTypes” library. The conversion also generates a ComplexType-specific generic relation for composition, and another relation for lists. Particles of a ComplexType are converted to relations that are specific to the ComplexType and the particle, and that inherit the relation of the particle’s type. Likewise, attributes of the ComplexType are converted to relations that are specific to the ComplexType and the attribute.

For example, ComplexType “PlantItem” is converted to the “ComplexTypes.PlantItem” entity, which has a “ComplexTypes.hasPlantItem” composition relation and a “ComplexTypes.hasPlantItemList” relation for lists. The “ID” attribute is converted to the “ComplexTypes.PlantItem.hasID” functional relation, and the choice particle “Presentation” is converted to the “ComplexTypes.PlantItem.hasPresentation” relation.

<source lang="xml">
<xsd:complexType name="PlantItem">
  <xsd:choice minOccurs="0" maxOccurs="unbounded">
    <xsd:element ref="Presentation"/>
    <xsd:element ref="Extent"/>
    …
    <xsd:element name="ModelNumber" type="xsd:string"/>
    …
  </xsd:choice>
  <xsd:attribute name="ID" type="xsd:ID" use="required"/>
  <xsd:attribute name="TagName" type="xsd:string"/>
  …
  <xsd:attribute name="ComponentType">
    <xsd:simpleType>
      <xsd:restriction base="xsd:NMTOKEN">
        <xsd:enumeration value="Normal"/>
        <xsd:enumeration value="Explicit"/>
        <xsd:enumeration value="Parametric"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:attribute>
  …
</xsd:complexType>
</source>

PRO.ComplexTypes.PlantItem <T PRO.XML.ComplexType
PRO.ComplexTypes.hasPlantItem <R PRO.XML.hasComplexType
PRO.ComplexTypes.hasPlantItemList <R PRO.XML.hasElementList
   --> PRO.ComplexTypes.PlantItem
PRO.ComplexTypes.PlantItem.hasPresentation <R PRO.hasPresentation
   --> PRO.Presentation
PRO.ComplexTypes.PlantItem.hasExtent <R PRO.hasExtent
   --> PRO.Extent
…
PRO.ComplexTypes.PlantItem.hasID <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.String
…
PRO.ComplexTypes.PlantItem.hasComponentType <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.String

Element

Element definitions are processed similarly to ComplexTypes, but the converted types are put directly into the ontology without any library. Hence, Element “PlantModel” is converted to the “PlantModel” entity.

<source lang="xml">
<xsd:element name="PlantModel">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="PlantInformation"/>
      <xsd:element ref="Extent"/>
      <xsd:any namespace="##targetNamespace" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
</source>

PRO.hasPlantModel <R PRO.XML.hasElement
PRO.hasPlantModelList <R PRO.XML.hasElementList
PRO.PlantModel <T PRO.XML.Element
PRO.PlantModel.hasPlantInformation <R PRO.hasPlantInformation
   --> PRO.PlantInformation
PRO.PlantModel.hasExtent <R PRO.hasExtent
   --> PRO.Extent

When an Element is defined with ComplexContent, the base of the ComplexContent’s extension is converted to an L0.Inheritance relation between the types. For example, the “Equipment” Element has “PlantItem” as its extension base, so the “Equipment” entity inherits the “PlantItem” entity.

<source lang="xml">
<xsd:element name="Equipment">
  <xsd:complexType>
    <xsd:complexContent>
      <xsd:extension base="PlantItem">
        <xsd:choice minOccurs="0" maxOccurs="unbounded">
          <xsd:element ref="Discipline" minOccurs="0"/>
          <xsd:element ref="MinimumDesignPressure"/>
          …
          <xsd:element ref="Equipment"/>
          …
        </xsd:choice>
        <xsd:attribute name="ProcessArea" type="xsd:string"/>
        <xsd:attribute name="Purpose" type="xsd:string"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:element>
</source>

PRO.hasEquipment <R PRO.XML.hasElement
PRO.hasEquipmentList <R PRO.XML.hasElementList
PRO.Equipment <T PRO.XML.Element <T PRO.PlantItem
PRO.Equipment.hasProcessArea <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.String
PRO.Equipment.hasPurpose <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.String
PRO.Equipment.hasDiscipline <R PRO.hasDiscipline
   --> PRO.Discipline
PRO.Equipment.hasMinimumDesignPressure <R PRO.hasMinimumDesignPressure
   --> PRO.MinimumDesignPressure
…
PRO.Equipment.hasEquipment <R PRO.hasEquipment
   --> PRO.Equipment
…

Indicators (choice, sequence, all)

When an indicator has maxOccurs larger than 1, the relations generated from its particles have no multiplicity restrictions (as in all the previous examples). When the indicator is a choice with maxOccurs=1 (the default value of maxOccurs), the particle relation is expected to refer to only one object, which conforms to one of the specified types. For example, Element “TrimmedCurve” has a choice indicator with four elements (“Circle”, “PCircle”, “Ellipse”, “PEllipse”), and that choice is converted to the “TrimmedCurve.hasCircleOrPCircleOrEllipseOrPEllipse” relation.

<source lang="xml">
<xsd:element name="TrimmedCurve" substitutionGroup="Curve">
  <xsd:complexType>
    <xsd:complexContent>
      <xsd:extension base="Curve">
        <xsd:sequence>
          <xsd:choice>
            <xsd:element ref="Circle"/>
            <xsd:element ref="PCircle"/>
            <xsd:element ref="Ellipse"/>
            <xsd:element ref="PEllipse"/>
          </xsd:choice>
          <xsd:element ref="GenericAttributes" minOccurs="0"/>
        </xsd:sequence>
        <xsd:attribute name="StartAngle" type="xsd:double" use="required"/>
        <xsd:attribute name="EndAngle" type="xsd:double" use="required"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:element>
</source>

PRO.hasTrimmedCurve <R PRO.XML.hasElement
PRO.hasTrimmedCurveList <R PRO.XML.hasElementList
PRO.TrimmedCurve <T PRO.XML.Element <T PRO.Curve
PRO.TrimmedCurve.hasStartAngle <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.Double
PRO.TrimmedCurve.hasEndAngle <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.Double
PRO.TrimmedCurve.hasCircleOrPCircleOrEllipseOrPEllipse <R PRO.hasCircle <R PRO.hasPCircle <R PRO.hasEllipse <R PRO.hasPEllipse
   --> PRO.Circle
   --> PRO.PCircle
   --> PRO.Ellipse
   --> PRO.PEllipse
PRO.TrimmedCurve.hasGenericAttributes <R PRO.hasGenericAttributes
   --> PRO.GenericAttributes
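
The name of the combined choice relation in the TrimmedCurve example looks like the choice's element names joined with “Or” and prefixed with “has”. The sketch below reproduces only that naming pattern, as inferred from this single example; the class ChoiceRelationName and its method are hypothetical and are not taken from the converter.

```java
import java.util.List;

/** Hypothetical helper that mimics the relation naming seen in the TrimmedCurve example. */
final class ChoiceRelationName {
    private ChoiceRelationName() {}

    /** Joins the element names of a choice with "Or" and prefixes the result with "has". */
    static String of(List<String> elementNames) {
        if (elementNames.isEmpty()) {
            throw new IllegalArgumentException("A choice needs at least one element name");
        }
        return "has" + String.join("Or", elementNames);
    }
}
```

Calling ChoiceRelationName.of(List.of("Circle", "PCircle", "Ellipse", "PEllipse")) gives hasCircleOrPCircleOrEllipseOrPEllipse, which matches the relation name in the generated ontology above.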

Note that Model Group definitions are not currently supported!

Customization via configuration

Attribute composition

The Attribute Composition rule allows converting separate attributes into one array. For example, the following rule:

<source lang="xml">
<AttributeComposition Name="XYZ" Type="doubleArray">
  <Attribute Name="X" Type="double"/>
  <Attribute Name="Y" Type="double"/>
  <Attribute Name="Z" Type="double"/>
</AttributeComposition>
</source>

causes the “X”, “Y” and “Z” double attributes in the “Coordinate” Element definition

<source lang="xml">
<xsd:element name="Coordinate">
  <xsd:complexType>
    <xsd:attribute name="X" type="xsd:double" use="required"/>
    <xsd:attribute name="Y" type="xsd:double" use="required"/>
    <xsd:attribute name="Z" type="xsd:double"/>
  </xsd:complexType>
</xsd:element>
</source>

to be converted to the “XYZ” double array:

PRO.Coordinate <T PRO.XML.Element
PRO.Coordinate.hasXYZ <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.DoubleArray
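
At import time, the effect of such a rule is that the values of the separate attributes end up in one array. The following is a rough sketch of that step on plain Java collections, assuming the attribute values arrive as strings; the class AttributeCompositionSketch is hypothetical. How the real parser treats a missing optional attribute (such as “Z” above) is not described here, so the sketch simply rejects it.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of composing separate attributes into one double array. */
final class AttributeCompositionSketch {
    private AttributeCompositionSketch() {}

    /**
     * Reads the named attributes in the given order and parses them as doubles.
     * Assumption: a missing attribute is an error; the real converter may behave differently.
     */
    static double[] compose(Map<String, String> attributes, List<String> names) {
        double[] values = new double[names.size()];
        for (int i = 0; i < names.size(); i++) {
            String name = names.get(i);
            String raw = attributes.get(name);
            if (raw == null) {
                throw new IllegalArgumentException("Missing attribute: " + name);
            }
            values[i] = Double.parseDouble(raw);
        }
        return values;
    }
}
```

With the attributes X="1.0", Y="2.5" and Z="-3" and the name order X, Y, Z, the method returns an array of three doubles in that order.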

ID references

Referencing other XML elements is usually done using element IDs. The ID Provider and ID Reference rules allow converting these references to statements in the Simantics DB.

The ID Provider rule is used for retrieving the ID from referred objects. The rule does not affect the generated ontology.

<source lang="xml">
<IDProvider>
  <ComplexType Name="PlantItem"/>
  <Attribute Name="ID" Type="string"/>
</IDProvider>
</source>

The ID Reference rule is used for objects that use ID references. IDSource tells which attribute is used to refer to another Element, and Reference defines the name of the relation. With the following rule:

<source lang="xml">
<IDReference>
  <Element Name="Connection"/>
  <IDSource Name="ToID" Type="string"/>
  <Reference Name="ToIDRef" Type="ref"/>
</IDReference>
</source>

the “Connection” element definition’s “ToID” reference is converted to the ToIDRef relation.

<source lang="xml">
<xsd:element name="Connection">
  <xsd:complexType>
    <xsd:attribute name="ToID" type="xsd:string"/>
    …
  </xsd:complexType>
</xsd:element>
</source>

The original attribute is kept, so that if the ID reference cannot be located, the information about the reference still exists.

PRO.Connection <T PRO.XML.Element
PRO.Connection.hasToID <R PRO.XML.hasAttribute: L0.FunctionalRelation
   --> L0.String
PRO.Connection.ToIDRef <R PRO.XML.hasReference

In imported data, the reference statement points to the referred Element if the parser is able to locate an Element with the given ID.

Predicate              Object                            Graph
Basic information
InstanceOf             Connection                        DB
Is Related To
hasToID                V_02_N6                           DB
ToIDRef                $412442 : (Nozzle)                DB
Other statements
hasConnection/Inverse  $416145 : (PipingNetworkSegment)  DB
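
The interplay of the two rules can be pictured as an index from ID values to imported objects: ID providers fill the index, and ID references look their target up in it. The sketch below shows this idea with plain Java collections only. The class IdReferenceSketch is hypothetical and does not use the Simantics API, and an unresolved reference is returned as an empty Optional, in line with the original attribute being kept.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical ID index: providers register their ID, references are resolved against it. */
final class IdReferenceSketch<T> {
    private final Map<String, T> elementsById = new HashMap<>();

    /** Called for an object matched by an ID Provider rule. */
    void register(String id, T element) {
        elementsById.put(id, element);
    }

    /** Called for an ID Source attribute value; empty if no object with that ID was imported. */
    Optional<T> resolve(String idValue) {
        return Optional.ofNullable(elementsById.get(idValue));
    }
}
```

In terms of the data shown above, registering a Nozzle under the ID V_02_N6 and then resolving the Connection's hasToID value V_02_N6 would return that Nozzle, which is the object the ToIDRef statement points to.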

Ordered child

The Ordered Child rule controls whether the original order of the elements is stored into lists. A rule either forces the creation of the lists (used when the schema is interpreted to be indifferent to the order) or disables the list generation. Currently the rule has two types, original and child. An original type rule sets whether all the child elements are put into “OriginalElementList”. A child type rule sets whether the child elements are added to type-specific lists.

<source lang="xml">
<OrderedChild Type="original" Value="disable">
  <ComplexType Name="PlantItem"/>
</OrderedChild>

<OrderedChild Type="child" Value="disable">
  <ComplexType Name="PlantItem"/>
</OrderedChild>
</source>

Unrecognized child elements

The Unrecognized Child Element rule allows processing XML files that do not conform to the given schema and use different element names. In practice, the rule allows injecting Java code into the generated parser classes. The code is put into a method whose signature is:

public void configureChild(WriteGraph graph, Deque<Element> parents, Element element, Element child) throws DatabaseException

The method is called with “element” as the element that conforms to the type given in the rule’s configuration, and “child” as the child element that could not be recognized. The following example handles incorrect files in which the element name has been replaced with the contents of the attribute “Name”.

<source lang="xml">
<UnrecognizedChildElement>
  <Element Name="GenericAttributes"/>
  <JavaMethod>
  // Some commercial software does not handle GenericAttribute elements properly:
  // it uses the "Name" attribute's value as the element's name.
  GenericAttribute ga = new GenericAttribute();
  // Copy the child's attributes and add the element name as the "Name" attribute.
  java.util.List<Attribute> attributes = new java.util.ArrayList<Attribute>();
  attributes.addAll(child.getAttributes());
  attributes.add(new Attribute("Name", "Name", "", child.getQName()));
  // Re-create the child as a proper GenericAttribute element and parse it.
  Element newChild = new Element("", "", "GenericAttribute", attributes);
  newChild.setParser(ga);
  Resource res = ga.create(graph, newChild);
  if (res != null) {
    newChild.setData(res);
    parents.push(element);
    ga.configure(graph, parents, newChild);
    connectChild(graph, element, newChild);
    parents.pop();
  }
  </JavaMethod>
</UnrecognizedChildElement>
</source>

An example of an incorrect file:

<source lang="xml">
<GenericAttributes Number="28" Set="Values">
  <Assembly Format="string" Value="5. Auxiliary Steam System" />
  <Bendingradiusrtube Format="double" Value="0" Units="mm" ComosUnits="mm M01.15" />
  <CostCode Format="double" Value="607" />
  …
</GenericAttributes>
</source>

whereas the content should be:

<source lang="xml">
<GenericAttributes Number="28" Set="Values">
  <GenericAttribute Name="Assembly" Format="string" Value="5. Auxiliary Steam System" />
  <GenericAttribute Name="Bendingradiusrtube" Format="double" Value="0" Units="mm" />
  <GenericAttribute Name="CostCode" Format="double" Value="607" />
  …
</GenericAttributes>
</source>
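
Stripped of the parser classes, the repair done by the injected method amounts to keeping the child's attributes and adding its element name as a “Name” attribute. The sketch below shows only that data transformation on a plain map; the class GenericAttributeRepairSketch is hypothetical and stands in for the generated Element and Attribute classes, whose APIs are not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, parser-independent view of the repair done for unrecognized children. */
final class GenericAttributeRepairSketch {
    private GenericAttributeRepairSketch() {}

    /**
     * Returns the attributes of the repaired GenericAttribute element:
     * the child's own attributes plus "Name" set to the child's element name.
     */
    static Map<String, String> repair(String childElementName, Map<String, String> childAttributes) {
        Map<String, String> repaired = new LinkedHashMap<>(childAttributes);
        repaired.put("Name", childElementName);
        return repaired;
    }
}
```

For the incorrect CostCode element above, repair("CostCode", ...) produces the attributes Format="double", Value="607" and Name="CostCode", which correspond to the expected GenericAttribute element.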

References

  1. W3C XML Schema definition language (XSD) 1.1 Part 1: Structures http://www.w3.org/TR/xmlschema11-1/
  2. W3C XML Schema definition language (XSD) 1.1 Part 2: Datatypes http://www.w3.org/TR/xmlschema11-2/
  3. Layer0 specification http://dev.simantics.org/images/c/c8/Layer0.pdf