Chapter 8 XML Data Modeling


Prof. Dr.-Ing. Stefan Deßloch
Building 36, Room 329
Tel. 0631/205 3275
dessloch@informatik.uni-kl.de

Contents: Overview

I. Object Orientation and Extensibility
   1. User-defined data types and typed tables
   2. Object-relational views and collection types
   3. User-defined routines and object behavior
   4. Connecting to application programs
   5. Object-relational SQL and Java
II. Online Analytic Processing
   6. Data analysis in SQL
   7. Windows and query functions
III. XML
   8. XML data modeling
   9. SQL/XML
   10. XQuery

XML Origin and Usages

- Defined by the World Wide Web Consortium (W3C)
- Originally intended as a document markup language, not a database language
  - Documents have tags giving extra information about sections of the document
  - For example:

        <title> XML </title>
        <slide> XML Origin and Usages </slide>

- Derived from SGML (Standard Generalized Markup Language)
  - standard for document description
  - enables document interchange in publishing, office, engineering, ...
  - main idea: separate form from structure
- XML is simpler to use than SGML
  - roughly 20% of the complexity achieves 80% of the functionality

XML Origin and Usages (cont.)

- XML documents are to some extent self-documenting
  - Tags can be used as metadata
- Example:

    <bank>
      <account>
        <account-number> A-101 </account-number>
        <branch-name> Downtown </branch-name>
        <balance> 500 </balance>
      </account>
      <depositor>
        <account-number> A-101 </account-number>
        <customer-name> Johnson </customer-name>
      </depositor>
    </bank>
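The "tags as metadata" idea can be illustrated with a few lines of Python. This is only a sketch using the standard library module xml.etree.ElementTree; the helper name `records` is made up for this illustration. It turns each child of the root element into a record whose field names are taken directly from the tag names.

```python
import xml.etree.ElementTree as ET

# The bank document from the slide (whitespace inside values trimmed).
BANK_XML = """
<bank>
  <account>
    <account-number>A-101</account-number>
    <branch-name>Downtown</branch-name>
    <balance>500</balance>
  </account>
  <depositor>
    <account-number>A-101</account-number>
    <customer-name>Johnson</customer-name>
  </depositor>
</bank>
"""

def records(xml_text):
    """Return (tag, {subtag: text}) for every child of the root element.

    Hypothetical helper: the tag names act as the field names, so no
    external schema is needed to label the values.
    """
    root = ET.fromstring(xml_text)
    return [
        (child.tag, {field.tag: (field.text or "").strip() for field in child})
        for child in root
    ]

bank_records = records(BANK_XML)
for kind, fields in bank_records:
    print(kind, fields)
```

Note that every value comes back as a string; nothing in the document itself says that a balance is a number.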

Forces Driving XML

- Document Processing
  - Goal: use document in various, evolving systems
  - structure, content, layout
  - grammar: markup vocabulary for mixed content
- Databases and Data Exchange
  - Goal: data independence
  - structured, typed data
  - schema-driven
  - integrity constraints
- Semi-structured Data and Information Integration
  - Goal: integrate autonomous data sources
  - data source schema not known in detail
  - schemata are dynamic
  - schema might be revealed through analysis only after data processing

XML Language Specifications

[Diagram of related specifications: XSL (XSLT, XSL-FO), XML Link, XML Pointer, XPath, XQuery, XML Metadata Interchange, XML Schema, XML Namespace, XHTML, Extensible Markup Language, Cascading Style Sheets, Unified Modeling Language, Meta Object Facility, Unicode, Standard Generalized Markup Language, Document Type Definition]

Describing XML Data: Basics - Elements

- Tag: label for a section of data
- Element: section of data beginning with <tagname> and ending with matching </tagname>
- Elements must be properly nested
  - Formally: every start tag must have a unique matching end tag within the context of the same parent element
- Every document must have a single top-level element
- Mixture of text with sub-elements is legal in XML
  - Example:

        <account>
          This account is seldom used any more.
          <account-number> A-102 </account-number>
          <branch-name> Perryridge </branch-name>
          <balance> 400 </balance>
        </account>

  - Useful for document markup, but discouraged for data representation

XML Document Example

    <bank-1>
      <customer>
        <customer-name> Hayes </customer-name>
        <customer-street> Main </customer-street>
        <customer-city> Harrison </customer-city>
        <account>
          <account-number> A-102 </account-number>
          <branch-name> Perryridge </branch-name>
          <balance> 400 </balance>
        </account>
        <account>
          ...
        </account>
      </customer>
      ...
    </bank-1>
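The nesting rules above are exactly what an XML parser checks for well-formedness. The following sketch (standard library only; `is_well_formed` is a name chosen for this illustration) shows a properly nested document, two violations, and how mixed content appears to a program.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

properly_nested = "<account><balance>400</balance></account>"
badly_nested = "<account><balance>400</account></balance>"   # end tags cross
two_roots = "<account/><account/>"                            # no single top-level element

# Mixed content: text and sub-elements side by side inside <account>.
mixed = ET.fromstring(
    "<account>This account is seldom used any more."
    "<account-number>A-102</account-number></account>"
)
leading_text = mixed.text          # the text before the first sub-element
first_child_text = mixed[0].text   # the text inside <account-number>

print(is_well_formed(properly_nested), is_well_formed(badly_nested), is_well_formed(two_roots))
print(repr(leading_text), repr(first_child_text))
```

Having to pick the free text apart from the sub-elements is one reason mixed content is awkward for data representation.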

Describing XML Data: Attributes

- Attributes can be used to describe elements
- Elements can have attributes

    <account acct-type="checking">
      <account-number> A-102 </account-number>
      <branch-name> Perryridge </branch-name>
      <balance> 400 </balance>
    </account>

- Attributes are specified by name=value pairs inside the start tag of an element
- Attribute names must be unique within the element

    <account acct-type="checking" monthly-fee="5">

Attributes vs. Subelements

- Distinction between subelement and attribute
  - In the context of documents, attributes are part of markup, while subelement contents are part of the basic document contents
  - In the context of data representation, the difference is unclear and may be confusing
- Same information can be represented in two ways

    <account account-number="A-101"> ... </account>

    <account>
      <account-number>A-101</account-number>
      ...
    </account>
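A consumer of such data has to know which of the two encodings was chosen. The sketch below (standard library only; `account_number` is a hypothetical helper) reads the account number from either form and also shows that a repeated attribute name is rejected by the parser.

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<account account-number="A-101"/>')
as_subelement = ET.fromstring(
    "<account><account-number>A-101</account-number></account>"
)

def account_number(account):
    """Return the account number whichever way it was encoded."""
    value = account.get("account-number")           # attribute form
    if value is None:
        value = account.findtext("account-number")  # subelement form
    return value

def duplicate_attribute_rejected():
    """Attribute names must be unique within one start tag."""
    try:
        ET.fromstring('<account acct-type="checking" acct-type="savings"/>')
    except ET.ParseError:
        return True
    return False

print(account_number(as_attribute), account_number(as_subelement))
print(duplicate_attribute_rejected())
```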

More on XML Syntax

- Shortcut: elements without subelements or text content can be abbreviated by ending the start tag with /> and deleting the end tag

    <account number="A-101" branch="Perryridge" balance="200"/>

- To store string data that may contain tags, without the tags being interpreted as subelements, use CDATA as below

    <![CDATA[<account> ... </account>]]>

  - Here, <account> and </account> are treated as just strings

XML Document Schema

- Metadata and database schemas constrain what information can be stored, and the data types of stored values
- Metadata are very important for data exchange
  - They allow data to be interpreted automatically and correctly
- XML documents are not required to have associated metadata/schema
- Two mechanisms for specifying XML schema
  - Document Type Definition (DTD)
  - XML Schema
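Both syntax points from the "More on XML Syntax" slide can be checked with the standard library parser. This is a small sketch; the wrapper element `note` is invented here only so that the CDATA section has a parent element.

```python
import xml.etree.ElementTree as ET

# The abbreviated form and the explicit start/end tag pair carry the same information.
short = ET.fromstring('<account number="A-101" branch="Perryridge" balance="200"/>')
long_form = ET.fromstring(
    '<account number="A-101" branch="Perryridge" balance="200"></account>'
)

# Inside a CDATA section, markup characters are plain character data.
note = ET.fromstring("<note><![CDATA[<account> </account>]]></note>")

print(short.attrib == long_form.attrib)  # same attributes either way
print(len(note), repr(note.text))        # no child elements, only a string
```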

Describing XML Data: DTD

- Type and structure of an XML document can be specified using a DTD
  - What elements can occur
  - What attributes can/must an element have
  - What subelements can/must occur inside each element, and how many times
- DTD does not constrain data types
  - All values represented as strings in XML
- DTD syntax

    <!ELEMENT element (subelements-specification) >
    <!ATTLIST element (attributes) >

Element Specification in DTD

- Subelements can be specified as
  - names of elements, or
  - #PCDATA (parsed character data), i.e., character strings
  - EMPTY (no subelements) or ANY (anything can be a subelement)
- Example

    <!ELEMENT depositor (customer-name, account-number)>
    <!ELEMENT customer-name (#PCDATA)>
    <!ELEMENT account-number (#PCDATA)>

- Subelement specification may have regular expressions

    <!ELEMENT bank ((account | customer | depositor)+)>

  Notation:
    |   alternatives
    ?   0 or 1 occurrence
    +   1 or more occurrences
    *   0 or more occurrences
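Because DTD content models are regular expressions over element names, they can be translated into ordinary regular expressions. The sketch below is not how a validating parser is implemented; it is a minimal illustration that handles only element-content models (names, parentheses, `,`, `|`, `?`, `+`, `*`) and raises an error on anything else such as #PCDATA. The function names are invented for this example.

```python
import re

# One token: an element name or one of the structural symbols.
TOKEN = re.compile(r"\s*([A-Za-z_][\w.\-]*|[(),|?*+])")
SYMBOLS = {"(", ")", "|", "?", "*", "+"}

def content_model_to_regex(model):
    """Translate a DTD element-content model into a compiled regex.

    Each element name becomes a group matching "name " (name plus a space),
    so a sequence of child tags can be matched as one space-terminated string.
    """
    model = model.strip()
    parts, pos = [], 0
    while pos < len(model):
        match = TOKEN.match(model, pos)
        if match is None:
            raise ValueError("unsupported content model near: " + model[pos:])
        token, pos = match.group(1), match.end()
        if token == ",":
            continue                      # a sequence is plain concatenation
        if token in SYMBOLS:
            parts.append("(?:" if token == "(" else token)
        else:
            parts.append("(?:" + re.escape(token + " ") + ")")
    return re.compile("".join(parts))

def children_match(model, child_tags):
    """Check a list of child element names against a content model."""
    text = "".join(tag + " " for tag in child_tags)
    return content_model_to_regex(model).fullmatch(text) is not None

BANK = "((account | customer | depositor)+)"
DEPOSITOR = "(customer-name, account-number)"

print(children_match(BANK, ["account", "customer", "account"]))
print(children_match(BANK, []))
print(children_match(DEPOSITOR, ["account-number", "customer-name"]))
```

The last call fails because a comma-separated model fixes the order of the subelements.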

Example: Bank DTD

    <!DOCTYPE bank [
      <!ELEMENT bank ((account | customer | depositor)+)>
      <!ELEMENT account (account-number, branch-name, balance)>
      <!ELEMENT customer (customer-name, customer-street, customer-city)>
      <!ELEMENT depositor (customer-name, account-number)>
      <!ELEMENT account-number (#PCDATA)>
      <!ELEMENT branch-name (#PCDATA)>
      <!ELEMENT balance (#PCDATA)>
      <!ELEMENT customer-name (#PCDATA)>
      <!ELEMENT customer-street (#PCDATA)>
      <!ELEMENT customer-city (#PCDATA)>
    ]>

Attribute Specification in DTD

- Attribute specification: for each attribute
  - Name
  - Type of attribute
    - CDATA
    - ID (identifier) or IDREF (ID reference) or IDREFS (multiple IDREFs)
  - Whether the attribute
    - is mandatory (#REQUIRED),
    - has a default value (value),
    - or neither (#IMPLIED)
- Examples

    <!ATTLIST account acct-type CDATA "checking">
    <!ATTLIST customer
              customer-id ID     #REQUIRED
              accounts    IDREFS #REQUIRED>
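Attribute defaults declared in a DTD become visible to applications even when the document omits the attribute. The sketch below relies on the expat-based parser behind Python's xml.etree.ElementTree, which reads attribute declarations in the internal DTD subset but does not validate (a missing #REQUIRED attribute would go unnoticed). The small DTD is a reduced variant made up for this example, not the bank DTD from the slide.

```python
import xml.etree.ElementTree as ET

# Reduced, illustrative document with an internal DTD subset.
DOC = """<!DOCTYPE bank [
  <!ELEMENT bank (account+)>
  <!ELEMENT account EMPTY>
  <!ATTLIST account
            account-number ID    #REQUIRED
            acct-type      CDATA "checking">
]>
<bank>
  <account account-number="A-101"/>
  <account account-number="A-102" acct-type="savings"/>
</bank>"""

bank = ET.fromstring(DOC)

# The first account does not spell out acct-type; the declared default is supplied.
acct_types = [account.get("acct-type") for account in bank]
print(acct_types)
```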

IDs and IDREFs

- An element can have at most one attribute of type ID
- The ID attribute value of each element in an XML document must be distinct
  - ID attribute (value) is an object identifier
- An attribute of type IDREF must contain the ID value of an element in the same document
- An attribute of type IDREFS contains a set of (0 or more) ID values. Each of these values must be the ID value of an element in the same document
- IDs and IDREFs are untyped, unfortunately
  - Example below: the owners attribute of an account may contain a reference to another account, which is meaningless; the owners attribute should ideally be constrained to refer to customer elements

Example: Extended Bank DTD

- Bank DTD with ID and IDREF attribute types

    <!DOCTYPE bank-2 [
      <!ELEMENT account (branch-name, balance)>
      <!ATTLIST account
                account-number ID     #REQUIRED
                owners         IDREFS #REQUIRED>
      <!ELEMENT customer (customer-name, customer-street, customer-city)>
      <!ATTLIST customer
                customer-id ID     #REQUIRED
                accounts    IDREFS #REQUIRED>
      ... declarations for branch, balance, customer-name, customer-street and customer-city ...
    ]>
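The ID/IDREF rules can be sketched as a small check. Python's standard library does not validate against a DTD, so the sketch hard-codes which attributes are of type ID and IDREFS according to the extended bank DTD; the sample document is a reduced instance written for this illustration, and `check_references` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Attribute roles taken from the extended bank DTD (hard-coded assumption,
# since the standard library parser does not interpret ID/IDREFS types).
ID_ATTRS = {"account": "account-number", "customer": "customer-id"}
IDREFS_ATTRS = {"account": "owners", "customer": "accounts"}

BANK2 = """
<bank-2>
  <account account-number="A-401" owners="C100 C102">
    <branch-name>Downtown</branch-name><balance>500</balance>
  </account>
  <customer customer-id="C100" accounts="A-401"/>
  <customer customer-id="C102" accounts="A-401"/>
</bank-2>
"""

def check_references(xml_text):
    """Return a list of ID/IDREF violations found in the document."""
    root = ET.fromstring(xml_text)
    ids, problems = set(), []
    for elem in root.iter():
        attr = ID_ATTRS.get(elem.tag)
        if attr is not None:
            value = elem.get(attr)
            if value in ids:
                problems.append("duplicate ID " + value)
            ids.add(value)
    for elem in root.iter():
        attr = IDREFS_ATTRS.get(elem.tag)
        if attr is not None:
            for ref in elem.get(attr, "").split():
                if ref not in ids:
                    problems.append("dangling IDREF " + ref)
    return problems

# An account that lists itself as owner: meaningless, yet accepted,
# because IDREFs only require that *some* element carries the ID.
self_owned = BANK2.replace('owners="C100 C102"', 'owners="A-401"')
dangling = BANK2.replace('owners="C100 C102"', 'owners="C999"')

print(check_references(BANK2))
print(check_references(self_owned))
print(check_references(dangling))
```

The second result illustrates the "untyped" problem: the check cannot tell a customer ID from an account ID.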

XML data with ID and IDREF attributes

    <bank-2>
      <account account-number="A-401" owners="C100 C102">
        <branch-name> Downtown </branch-name>
        <balance> 500 </balance>
      </account>
      ...
      <customer customer-id="C100" accounts="A-401">
        <customer-name> Joe </customer-name>
        <customer-street> Monroe </customer-street>
        <customer-city> Madison </customer-city>
      </customer>
      <customer customer-id="C102" accounts="A-401 A-402">
        <customer-name> Mary </customer-name>
        <customer-street> Erin </customer-street>
        <customer-city> Newark </customer-city>
      </customer>
    </bank-2>

Describing XML Data: XML Schema

- XML Schema is closer to the general understanding of a (database) schema
- XML Schema supports
  - Typing of values
    - E.g. integer, string, etc.
    - Constraints on min/max values
  - Typed references
  - User-defined types
  - Specified in XML syntax (unlike DTDs)
  - Integrated with namespaces
  - Many more features
    - List types, uniqueness and foreign key constraints, inheritance, ...
- BUT: significantly more complicated than DTDs

XML Schema Structures

- Datatypes (Part 2)
  - Describes types of scalar (leaf) values
- Structures (Part 1)
  - Describes types of complex values (attributes, elements)
  - Regular tree grammars: repetition, optionality, choice, recursion
- Integrity constraints
  - Functional dependencies (keys) and inclusion dependencies (foreign keys)
- Subtyping (akin to OO models)
  - Describes inheritance relationships between types
  - Supports schema reuse

XML Schema Structures (cont.)

- Elements: tag name and simple or complex type

    <xsd:element name="sponsor" type="xsd:string"/>
    <xsd:element name="action" type="Action"/>

- Attributes: tag name and simple type

    <xsd:attribute name="date" type="xsd:date"/>

- Complex types

    <xsd:complexType name="Action">
      <xsd:sequence>
        <xsd:element ref="action-date"/>
        <xsd:element ref="action-desc"/>
      </xsd:sequence>
    </xsd:complexType>

XML Schema Structures (cont.)

- Sequence

    <xsd:sequence>
      <xsd:element name="congress" type="xsd:string"/>
      <xsd:element name="session" type="xsd:string"/>
    </xsd:sequence>

- Choice

    <xsd:choice>
      <xsd:element name="author" type="PersonName"/>
      <xsd:element name="editor" type="PersonName"/>
    </xsd:choice>

- Repetition

    <xsd:sequence minOccurs="1" maxOccurs="unbounded">
      <xsd:element name="section" type="Section"/>
    </xsd:sequence>

Namespaces

- A single XML document may contain elements and attributes defined for and used by multiple software modules
  - Motivated, for example, by modularization considerations
- Name collisions have to be avoided
- Example:
  - A Book XSD contains a Title element for the title of a book
  - A Person XSD contains a Title element for an honorary title of a person
  - A BookOrder XSD references both XSDs
- Namespaces specify how to construct universally unique names

Namespaces (cont.)

- A namespace is a collection of names identified by a URI
- Namespaces are declared via a set of special attributes
  - These attributes are prefixed by xmlns
  - Example:

        <BookOrder xmlns:customer="http://mysite.com/person"
                   xmlns:item="http://yoursite.com/book">

- A namespace declaration applies to the element where it is declared and to all elements within its content, unless overridden
- Elements/attributes from a particular namespace are prefixed by the name assigned to the namespace in the corresponding declaration of the using XML document

        customer:Title='Dr'
        item:Title='Introduction to XML'

- A default namespace declaration fixes the namespace of unqualified names
  - Example:

        <BookOrder xmlns="http://mysite.com/person"
                   xmlns:item="http://yoursite.com/book">

Namespaces and XML Schema

- XML Schema elements and data types are imported from the XML Schema namespace http://www.w3.org/2001/XMLSchema
  - xsd is generally used as a prefix
- The vocabulary defined in an XML Schema file belongs to a target namespace
  - declared using the targetNamespace attribute
  - declaring a target namespace is optional
    - if none is provided, the vocabulary does not belong to a namespace
    - required for creating XML schemas for validating (pre-namespace) XML 1.0 documents
- An XML document using an XML schema
  - declares the namespace, which refers to the target namespace of the underlying schema
  - can provide additional hints where an XML schema (xsd) file for the namespace is located
    - schemaLocation attribute
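How namespaces resolve the Title collision can be seen in the names a namespace-aware parser reports. The sketch below uses Python's xml.etree.ElementTree, which expands every qualified name to the form `{URI}local-name`. The document body, with the two titles written as child elements, is an assumption made for this illustration; the prefixes `p` and `b` in the lookup table are likewise arbitrary.

```python
import xml.etree.ElementTree as ET

ORDER = """
<BookOrder xmlns:customer="http://mysite.com/person"
           xmlns:item="http://yoursite.com/book">
  <customer:Title>Dr</customer:Title>
  <item:Title>Introduction to XML</item:Title>
</BookOrder>
"""

root = ET.fromstring(ORDER)

# Both children have the local name "Title", but their expanded names differ.
expanded_names = [child.tag for child in root]

# Prefixes are local shorthand: a program may pick its own for lookups.
ns = {"p": "http://mysite.com/person", "b": "http://yoursite.com/book"}
person_title = root.findtext("p:Title", namespaces=ns)
book_title = root.findtext("b:Title", namespaces=ns)

# A default namespace declaration puts unprefixed element names into that namespace.
default_ns_root = ET.fromstring('<BookOrder xmlns="http://mysite.com/person"/>')

print(expanded_names)
print(person_title, "/", book_title)
print(default_ns_root.tag)
```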

XML Schema Version of Bank DTD

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://www.banks.org"
                xmlns="http://www.banks.org">
      <xsd:element name="bank" type="BankType"/>
      <xsd:element name="account">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="account-number" type="xsd:string"/>
            <xsd:element name="branch-name" type="xsd:string"/>
            <xsd:element name="balance" type="xsd:decimal"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      ... definitions of customer and depositor ...
      <xsd:complexType name="BankType">
        <xsd:choice minOccurs="1" maxOccurs="unbounded">
          <xsd:element ref="account"/>
          <xsd:element ref="customer"/>
          <xsd:element ref="depositor"/>
        </xsd:choice>
      </xsd:complexType>
    </xsd:schema>

XML Document Using Bank Schema

    <bank xmlns="http://www.banks.org"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.banks.org Bank.xsd">
      <account>
        <account-number> ... </account-number>
        <branch-name> ... </branch-name>
        <balance> ... </balance>
      </account>
      ...
    </bank>
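Since an XML Schema is itself an XML document, it can be inspected with ordinary XML tools. The sketch below does not validate anything (Python's standard library has no XML Schema validator); it only reads a reduced copy of the bank schema, leaving out the customer and depositor parts that are elided on the slide, and lists what it declares.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
XSD_ELEMENT = "{%s}element" % XSD_NS   # expanded name of xsd:element

# Reduced copy of the bank schema (customer and depositor omitted).
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.banks.org"
            xmlns="http://www.banks.org">
  <xsd:element name="bank" type="BankType"/>
  <xsd:element name="account">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="account-number" type="xsd:string"/>
        <xsd:element name="branch-name" type="xsd:string"/>
        <xsd:element name="balance" type="xsd:decimal"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="BankType">
    <xsd:choice minOccurs="1" maxOccurs="unbounded">
      <xsd:element ref="account"/>
    </xsd:choice>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

target_namespace = schema.get("targetNamespace")

# Global element declarations are the xsd:element children of xsd:schema.
global_elements = [decl.get("name") for decl in schema.findall(XSD_ELEMENT)]

# Local declarations nested inside the global "account" element, with their types.
account_fields = {
    decl.get("name"): decl.get("type")
    for decl in schema.findall(XSD_ELEMENT + "[@name='account']//" + XSD_ELEMENT)
}

print(target_namespace)
print(global_elements)
print(account_fields)
```

Unlike in the DTD version, the balance is declared as xsd:decimal rather than as an untyped string.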

Assertions in XML Schema

- Uniqueness: <unique> element, <key> element
  - enforces uniqueness of attribute or element values: <field> element(s)
  - can be applied to/declared for specific parts of the XML document: <selector> element
- Example: zip codes should be unique for a combination of region and part number

    <unique name="constraint1">
      <selector xpath=".//regions/zip"/>
      <field xpath="@code"/>
      <field xpath="part/@number"/>
    </unique>

- Some remarks
  - NULL value semantics: nillable at the schema level, nil in the document
  - <key> is equivalent to <unique> plus nillable="false"
  - uniqueness can be restricted to XML fragments
  - composite keys/unique elements

Mapping ER-Model -> XML Schema

- Mapping entities
  - 1:1 mapping to XML elements
  - use <key> to represent ER key attributes

[ER diagram: entity ABT with attributes anr, name, street]

Abbreviated schema notation (namespace prefix and complexType wrappers omitted):

    <element name="ABT">
      <attribute name="anr" type="string"/>
      <attribute name="street" type="string"/>
      <attribute name="name" type="string"/>
    </element>
    <key name="abt_pk">
      <selector xpath=".//ABT"/>
      <field xpath="@anr"/>
    </key>
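The selector/field mechanism of a uniqueness constraint can be imitated in a few lines. This is a sketch only: it is not an XML Schema validator, it supports just attribute-valued fields such as `@code` or `part/@number`, and the sample data and function names are invented for the illustration.

```python
import xml.etree.ElementTree as ET

def field_value(elem, field):
    """Evaluate a field path like '@code' or 'part/@number' relative to elem."""
    path, _, attr = field.rpartition("@")
    target = elem if path == "" else elem.find(path.rstrip("/"))
    return None if target is None else target.get(attr)

def unique_violations(scope, selector, fields):
    """Return the field-value tuples that occur more than once.

    selector picks the elements the constraint applies to; fields build the
    (possibly composite) value that must be unique among them.
    """
    seen, duplicates = set(), []
    for elem in scope.findall(selector):
        value = tuple(field_value(elem, f) for f in fields)
        if None in value:
            continue   # <unique> ignores incomplete values; <key> would reject them
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    return duplicates

REGIONS = ET.fromstring("""
<regions>
  <zip code="95819"><part number="872-AA"/></zip>
  <zip code="63143"><part number="872-AA"/></zip>
  <zip code="95819"><part number="872-AA"/></zip>
</regions>
""")

by_code_and_part = unique_violations(REGIONS, ".//zip", ["@code", "part/@number"])
by_part_only = unique_violations(REGIONS, ".//zip", ["part/@number"])

print(by_code_and_part)
print(by_part_only)
```

The composite constraint flags only the repeated (code, part number) pair, whereas a constraint on the part number alone also rejects the second entry.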

Mapping 1:N Relationships

[ER diagram: ABT (anr, name, street) 1 : N ANG (pnr, name, office), connected by relationship ABT-ANG]

- Mapping alternatives
  - Nesting

        <element name="abt">
          <sequence>
            <element name="ang">
              <attribute name="name" type="string"/>
              <attribute name="office" type="string"/>
            </element>
          </sequence>
          <attribute name="street" type="string"/>
          <attribute name="name" type="string"/>
        </element>

  - Flat representation

        <element name="ABT">
          <sequence>
            <element ref="ANG"/>
          </sequence>
          <attribute name="street" type="string"/>
          <attribute name="name" type="string"/>
        </element>

        <element name="ANG">
          <attribute name="name" type="string"/>
          <attribute name="office" type="string"/>
        </element>

Primary/Foreign Keys

- Problem
  - nesting alone is not sufficient for modeling a 1:N relationship
  - element identity is required to avoid duplicate entries
- Foreign keys
  - guarantee referential integrity: <key> / <keyref> elements

    <element name="abt">
      <sequence>
        <element name="ang">
          <attribute name="pnr" type="string"/>
          <attribute name="name" type="string"/>
          <attribute name="office" type="string"/>
          <attribute name="abtid" type="string"/>
        </element>
      </sequence>
      <attribute name="anr" type="string"/>
      <attribute name="name" type="string"/>
      <attribute name="street" type="string"/>
    </element>

    <key name="abt_pk">
      <selector xpath=".//abt"/>
      <field xpath="@anr"/>
    </key>
    <unique name="ang_uniq">
      <selector xpath=".//abt/ang"/>
      <field xpath="@pnr"/>
    </unique>
    <keyref name="abt_fk" refer="abt_pk">
      <selector xpath=".//abt/ang"/>
      <field xpath="@abtid"/>
    </keyref>
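What the <key>/<keyref> pair above enforces can be sketched as a referential integrity check. The code is an illustration, not a schema validator; the department and employee values (K51, K55, K99, the names) are invented sample data, and the wrapper element `company` is an assumption.

```python
import xml.etree.ElementTree as ET

DEPARTMENTS = ET.fromstring("""
<company>
  <abt anr="K51" name="Planning" street="Main">
    <ang pnr="1" name="Smith" office="101" abtid="K51"/>
    <ang pnr="2" name="Jones" office="102" abtid="K51"/>
  </abt>
  <abt anr="K55" name="Sales" street="Park">
    <ang pnr="3" name="Brown" office="201" abtid="K99"/>
  </abt>
</company>
""")

def values(scope, selector, attr):
    """Collect one attribute value from every element matched by the selector."""
    return [elem.get(attr) for elem in scope.findall(selector)]

def dangling_references(scope, key_selector, key_attr, ref_selector, ref_attr):
    """Return keyref values that do not match any key value (foreign key violations)."""
    keys = set(values(scope, key_selector, key_attr))
    return [ref for ref in values(scope, ref_selector, ref_attr) if ref not in keys]

# abt_pk: @anr of .//abt ; abt_fk: @abtid of .//abt/ang must refer to it.
violations = dangling_references(DEPARTMENTS, ".//abt", "anr", ".//abt/ang", "abtid")
print(violations)
```

Employee 3 is nested inside department K55 but points to a non-existent department K99, which is exactly the inconsistency that nesting alone cannot prevent.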

Primary/Foreign Keys

- Advantages over ID/IDREF
  - based on equality of data types
  - composite keys
  - locality, restricting scope to parts of the XML document
- Mapping of N:M relationships
  - use <key>/<keyref> elements
  - flat modeling plus "pointers"
  - addition of a helper element, similar to the mapping to the relational model

        <element name="PROJ_ANG">
          <attribute name="pnr" type="string"/>
          <attribute name="jnr" type="string"/>
        </element>

[ER diagram: PROJ (jnr, bez, location) N : M ANG (pnr, name, office), connected by relationship PROJ-ANG]

Example: Mapping N:M Relationships

- Element mapping

    <element name="ANG">
      <attribute name="pnr" type="string"/>
      <attribute name="name" type="string"/>
      <attribute name="office" type="string"/>
      <attribute name="jnr" type="string"/>
    </element>

    <element name="PROJ">
      <attribute name="jnr" type="string"/>
      <attribute name="bez" type="string"/>
      <attribute name="location" type="string"/>
      <attribute name="pnr" type="string"/>
    </element>

- Referential constraints, composite keys

    <key name="ang_pk">
      <selector xpath=".//ANG"/>
      <field xpath="@pnr"/>
    </key>
    <key name="proj_pk">
      <selector xpath=".//PROJ"/>
      <field xpath="@jnr"/>
    </key>
    <unique name="proj_ang_uq">
      <selector xpath=".//PROJ_ANG"/>
      <field xpath="@pnr"/>
      <field xpath="@jnr"/>
    </unique>
    <keyref name="proj_fk" refer="proj_pk">
      <selector xpath=".//PROJ_ANG"/>
      <field xpath="@jnr"/>
    </keyref>
    <keyref name="ang_fk" refer="ang_pk">
      <selector xpath=".//PROJ_ANG"/>
      <field xpath="@pnr"/>
    </keyref>
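The N:M mapping with a helper element behaves like a link table in the relational model. The following sketch checks the two kinds of constraints from the slide on a small instance: the composite uniqueness of (pnr, jnr) in PROJ_ANG and the two foreign keys into ANG and PROJ. The sample data, the wrapper element `company`, and the function name are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

COMPANY = ET.fromstring("""
<company>
  <ANG pnr="1" name="Smith" office="101"/>
  <ANG pnr="2" name="Jones" office="102"/>
  <PROJ jnr="J1" bez="XML" location="KL"/>
  <PROJ_ANG pnr="1" jnr="J1"/>
  <PROJ_ANG pnr="2" jnr="J1"/>
  <PROJ_ANG pnr="2" jnr="J1"/>
  <PROJ_ANG pnr="3" jnr="J1"/>
</company>
""")

def check_assignments(root):
    """Check the PROJ_ANG helper elements against the keys of ANG and PROJ.

    Returns (duplicates, dangling): link pairs violating the composite
    unique constraint, and link pairs violating one of the two keyrefs.
    """
    employees = {e.get("pnr") for e in root.iter("ANG")}     # ang_pk
    projects = {p.get("jnr") for p in root.iter("PROJ")}     # proj_pk
    links = [(l.get("pnr"), l.get("jnr")) for l in root.iter("PROJ_ANG")]

    duplicates = [pair for pair, count in Counter(links).items() if count > 1]
    dangling = [
        pair for pair in links
        if pair[0] not in employees or pair[1] not in projects
    ]
    return duplicates, dangling

duplicates, dangling = check_assignments(COMPANY)
print("composite unique violations:", duplicates)
print("keyref violations:", dangling)
```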