Data Modeling Technique

Data Modeling Technique

A database is a collection of data. A database system is a database organized by a database management system. A database application system is an application making use of a database for data storage and manipulation.

A data model consists of data definition language and data manipulation language. The former defines the data structure of a database, and the latter defines how to access database for query and update. There are different kinds of data model including relational, object-oriented and XML.

An Ansi-architecture of data model consists of conceptual schema, logical schema and internal schema. The conceptual schema describes the constraints of data requirements in a diagram such as Extended Entity Relationship model and DTD Graph. The logical schema describes the data structure of the implementation of the conceptual schema such as relational in table structure, object-oriented in a class structure and XML in a tree structure. The internal schema describes the physical files implementation of logical schema such as B+ index files, hash files, inverted files, and multi-list files.

In data modeling, that is, when designing a database, each noun in the user requirement statements is a potential entity, and each verb is a potential relationship, Basically, an entity is a uniquely identifiable object, and a relationship between entities identifies the constraints between entities. A constraint means a data requirement or a rule that the database must follow in the application of the database,

The data semantics in the database describes the meaning of the data and its relationship with other data. For example, cardinality describes the data volume between entities. Weak entity describes the key dependency between weak entity and strong entity. Participation describes the mandatory or optional relationship between entities. Isa relationship describes the relationship between superclass and subclass entities. The generalization describes the relationship between multiple subclass entities with its superclass entity. The aggregation describes the fact that an aggregation entity must consist of its components in the form of entities and their relationship. A categorization describes the fact that each subclass data must be within one of its superclass entities. N-ary relationship describes that one entity or multiple entities related to each other.

Mapping an EER model into a relational schema is a forward engineering from design to implementation. Its procedure involves mapping each entity into a relation; each relationship into a data semantic as described in the above paragraph.

In database history, the First Generation database starts with Hierarchical / Network (Codasyl) data model starting in 60's. The Second Generation database is the Relational data model starting in 70's. The Third Generation database is called Post relational database such as Object-Oriented and XML data model starting in 90's.

In general, a database is a collection of data which can be organized by a DBMS(Database Management System). A database language consists of DDL(Data Definition Language) to define the data structure of a database schema, and DML(Data Manipulation Language) to define the data operation of a database program. A procedural DML means that stepwise database operations needed to operate on a database. A non-procedural DML means that one step operation can operate on database. A logical data independency means that the implementation of logical schema will not affect the design of the conceptual schema. Similarly, a physical data independency means that the implementation of physical files will not affect the design of the logical schema.

Relational is a data model. As a data model, it consists of conceptual schema such as Extended Entity Relationship model, logical schema such as relations, and physical schema such as an Indexed files etc. The conceptual schema describes paper design (blue print in users' mind) of a database. Logical schema describes the data structure (data definition language) of a database, and physical schema describes the physical files used to store data. Furthermore, an External schema provides an application program view of database schema.

In the case of relational data model, its conceptual schema is an EER model and which consists of the followings:

l In business requirements, each noun can be a potential entity (uniquely identified), and each verb can be a potential relationship. An not identifiable noun should be an attribute.

l Entities with the same key can be generalized in isa relationship, generalization or categorization.

l An Entity is an uniquely identifiable object

l An entity relates to another entity in a binary relationship. An entity relates to itself in a unary relationship. Three entities relate to each other in a ternary relationship.

l Data semantic implies the meaning and definition of data.

l The data value that is unknown or not applied is called null value.

l An entity is in a rectangle shape as shown below:

l A relationship is in Diamond shape as shown below:

l The data of a subset is inside a superset in isa relationship with the same domain value as shown below:

Father is a Male

l In relational database, all data semantics are implemented by the relationship of their keys.

For example, Refugee and non-refugee are disjoint generalized into boat people as shown below:

l Programmer and System Analyst are overlap generalized into professional as shown below:

A borrower of patient record can be categorized as either Department , Doctor or Hospital as shown below:

l An Aggregation is a Part of (component) data semantic such that a whole part exists only if its components exist. For example, a Booking Ticket must exist only if its components Customer(s) and Ticket(s) exist.

l A Total participation in relational database can be implemented by disallowing null value in Foreign key between two related relations. A Partial participation in relational database can be implemented by allowing null value in Foreign key between two related relations

l A Weak entity means that the Key of weak entity must concatenate the key of its strong entity. For example, the key of an entity Room is Hotel-name plus Room-no such that Hotel-name is a key of its strong entity Hotel.

The steps to map conceptual schema EER model into logical schema Relations are:

Step 1: Identify each data semantic in the EER model.

Step 2: Map the entities along with their attributes and keys into relations with correspondent attributes and keys. In general, each entity is mapped into a relation. Each relationship is mapped into a foreign key such the two related entities are mapped into two parent and child relations linked by the foreign key in the child relation. Each semantic has different data structure in the mapped relations. For example, the subclass entity is related with its superclass entity in isa relationship, and which can be mapped into subclass relation and superclass relation such that they have the same key and the key of subclass relation is also a foreign key refers to the key in the superclass relation. Similarly, isa relationship appears in generalization, and categorization in different format. Two subclass entities(relations) can be generalized into a superclass entity(relation). A subclass entity(relation) is categorized as a subset of a union set of its superclass entities(relations). A weak entity is mapped into a child relation which concatenates the key of its strong entity(relation). The relationship between two entities(relations) are mandatory in total participation while the relationship between two entities(relations) are optional in partial participation, and which can be differentiated by the null value of their foreign key. Null value in foreign key means partial participation; and non-null value means total participation. Three entities(relations) can be related to each other in ternary relationship. Their cardinality can be implemented by their key specification, and above all, the identification of the keys in their relationship relation which relates with other relations. A unary relationship is a entity(relation) relates to itself by using a foreign key in the same relation referring to the different data instances of the relation.

In practice, the objective of data modeling (database design) is to design a data model with conceptual, logical and physical in order to meet the data requirements of a community of users. The data requirements must be transformed into data semantics and implemented into a chosen data model. In this lecture, our data model is relational which means that all relevant data are related to each other to implement a computer application. As the users' data requirements change, and so will be the database design. As a result, Change is a norm in database design (data modeling).

Overall Business requirements:

We need to keep records about the boat people of their name, birth-place, birth-date, date-of-arrival, and skills, refugee status, and resettlement countries.

Screening procedure classifies the boat people Refugee or non-refugee.

A refugee can be divided into accepted refugee and waiting-refugee.

Some refugees classified as professionals of Engineer, Accountant or Doctor are accepted-refugee.

A detention center keeps many non-refugees and an open center keeps many waiting-refugees.

A departure centre keeps many accepted-refugees and a country settles many accepted refugees.

(a) User requirements:

We need to keep records about the boat people of their name, birth-place, birth-date, date-of-arrival, and skills, refugee status, and resettlement countries. Screening procedure classifies the boat people Refugee or non-refugee.

Data Semantics:

Refugee isa Boat People. Non-Refugee isa Boat People. Refugee and Non-Refugee can be disjoint generalized as Boat People.

EER model:

(b) User requirements:

A refugee can be divided into accepted refugee and waiting-refugee.

Data semantics:

An accepted refugee isa Refugee. A Waiting-Refugee isa Refugee. An Accepted Refugee and a Waiting-Refugee can be disjoint generalized as Refugee.

EER model:

(c) User requirements:

Some refugees classified as professionals of Engineer, Accountant or Doctor are accepted-refugee.

Data Semantics:

Refugee isa Professional. A Professional isa Engineer or a Professional isa Accountant or a Professional isa Doctor such that a Professional can be classified as either an Engineer or an Accountant or a Doctor.

EER model:

(d) User requirements:

A detention center keeps many non-refugees and an open center keeps many waiting-refugees.

Data Semantics:

A Detention Center is in one-to-many cardinality with Non-Refugee. An Open Center is in one-to-many cardinality with Waiting-Regufees.

EER model:

(e) User requirements:

A departure centre keeps many accepted-refugees and a country settles many accepted refugees.

Data semantics:

A Departure Centre is in one-to-many cardinality with Accepted-Refugees. A Country is in one-to-many cardinality with Accepted Refugees.

EER model:

As a result, we can integrated the above derived EER model into an EER model as the data modeling for the overall business requirements:

Map the above EER model into a relational schema:

l Map entity into relation.

l Map entity attribute and key into relation attribute and primary key.

l Map relationship into relation foreign key.

l Map weak entity into relation composite key.

l Map partial participation into nullable foreign key.

l Map m:n/n-ary relationship into relationship relation.

l Enforce other data semantic constraints of EER model by application programming.

Relation Boat_Person (Name, Birth_date, Birth_place)

Relation Refugee (*Name)

Relation Non_refugee (*Name, Status, *Center_name, Detain_date)

Relation Detention_center (Center_name)

Relation Waiting_refugee (*Name, *Open_center_name, Reside_date)

Relation Accepted_refugee (*Name, Resettle_date, *Country_name,

*Departure_center_name, Stay_date)

Relation Departure_center (Center_name)

Relation Country (Country_name)

Relation Professional (*Name)

Relation Engineer (Name)

Relation Accountant (Name)

Relation Doctor (Name)

Relation Open_Center (Open_center_name)

Where underlined are primary key, and prefixed with “*” are foreign key

ID: Refugee.Name Í Boat_Person.Name

ID: Non_Refugee.Name Í Boat_Person.Name

ID: Waiting_Refugee.Name Í Refugee.Name

ID: Accepted_Refugee.Name Í Professional.Name

ID: Professional.Name Í Engineer.Name

ID: Professional.Name Í Accountant.Name

ID: Professional.Name Í Doctor.Name

Tutorial Question

Design an Extended Entity Relationship Model for a Hospital database system with the following Application requirements:

One patient may have many insurance covers.

AE Record, Ward Record and Outpatient Record can all be a Medical Record, but data of the three kinds of records are mutually exclusive.

A Patient can have Insurance Cover his/her Medical Record expenses.

A Patient can have many patient Record Folders.

Many patient record Borrowers can borrow many patient Record Folders.

A Borrower can be a Department, a Doctor or Other Hospital.

For each borrow of patient record, an entry of Loan History is logged.

A Record Folder has many Medical Records.

Description of the entities in the Hospital database system with the following data requirements in parenthsis ( ):

A Patient has a patient name(Patient_name) and is identified by (HKID).

An Insurance Cover is identified by insurance number(Insurance_no).

A Medical Record has Create date(Create_date) and Sub type(Sub_type), and is identified by Medical Record Number(Medical_rec_no).

An AE Record has a AE number(AE_no) and is identified by Medical Record Number(Medical_rec_no).

A Ward Record has a Ward number(Ward_no), Admission Date(Admission_date), Discharge Date(Discharge_date) and is identified by Medical Record Number(Medical_rec_no).

An Outpatient record has a OPD Number(OPD_no), Specialty(Specialty) and is identified by Medical Record Number(Medical_Rec_no)

A patient Record Folder has a location(Location) and is identified by Folder Number(Folder_no).

A Department has a Department Name(Department_name), and is identified by Borrower(Borrower_no).

A Doctor has a Doctor Name(Doctor_name), and is identified by borrow number(Borrower_no).

An Other Hospital has a hospital name(Hospital_name) and is identified by Borrower number(Borrower_no).

A Loan History is identified by Loan Date(Loan_date), Borrower Number(Borrower_no) and Folder number(Folder_no).

Please be aware that the free essay that you were just reading was not written by us. This essay, and all of the others available to view on the website, were provided to us by students in exchange for services that we offer. This relationship helps our students to get an even better deal while also contributing to the biggest free essay resource in the UK!