The Information Technology Blog brought to you by 4 Ace Technologies

Monday, March 23, 2009

Entity Relation Diagram - Tutorial (Getahead)

Extracted from: http://www.getahead-direct.com/gwentrel.htm

The information in this free Entity Relationship Diagrams tutorial is taken from "GetAhead - Entity Relationship Diagrams".
It explains the entire process of drawing entity relationship diagrams, including worked examples and exercises.

Click here to see details of this course.

Entity Relationship Diagram - Introduction
data modeling - example data model
Entity relationship diagramming is a technique that is widely used in the world of business and information technology to show how information is, or should be, stored and used within a business system.

The success of any organization relies on the efficient flow and processing of information.

In this example information flows around the various departments within the organization. This information can take many forms, for example it could be written, oral or electronic.

Here is an example of the sort of information flows that you might be analyzing:

The general manager regularly communicates with staff in the sales and marketing and accounts departments by using e-mail. Orders received by sales and marketing are forwarded to the production and accounts departments, for fulfillment and invoicing. The accounts department forward regular written reports to the general manager, they also raise invoices and send these to the customers.

Data modeling is a technique aimed at optimizing the way that information is stored and used within an organization. It begins with the identification of the main data groups, for example the invoice, and continues by defining the detailed content of each of these groups. This results in structured definitions for all of the information that is stored and used within a given system.

The technique provides a solid foundation for systems design and a universal standard for system documentation. Data modeling is an essential precursor to analysis & design, maintenance & documentation and improving the performance of an existing system.


Entity Relationship Diagram - Diagram Notation
Entity relationship diagramming uses a standard set of symbols to represent each of these defined data groups and then proceeds by establishing the relationships between them. The first of these symbols is the soft-box entity symbol. data modeling - entity soft box
An entity is something about which data will be stored within the system under consideration. In this example the data group invoice can be identified as a system entity.

The other main component on a data model is the relationship line. A Relationship is an association between two entities to which all of the occurrences of those entities must conform.

data modeling - example relationship
The relationship is represented by a line that joins the two entities, to which it refers. This line represents two reciprocal relationships:That of the first entity with respect to the second, and that of the second entity with respect to the first.

Entity relationship diagramming is all about identifying entities and their relationships and then drawing a diagram that accurately depicts the system. This applies equally to the design of a new system or the analysis of an existing one.

The end result of
entity relationship diagramming should be a clear picture of how information is stored and related within a proposed, or existing, system.

Entity Relationship Diagram - Entities
Here, we illustrate the concept of an entity, which can be applied to almost anything that is significant to the system being studied. Some examples of information systems and their entities are listed below:

Banking system: Customer, Account, Loan.

Airline system: Aircraft, Passenger, Flight, Airport.

An entity is represented by a box containing the name of that entity.

A precise definition of ‘entity’ is not really possible, as they even vary in nature. For example, in the airline system, whilst an aircraft is a physical object (entities often are) a flight is an event and an airport is a location. However entities are nearly always those things about which data will be stored within the system under investigation.

Note that entities are always named in the singular; for example: customer, account and loan, and not customers, accounts and loans.

This course uses symbols that are standard in the IT industry. This uses the soft-box symbol shown to represent an entity. If a site uses a different symbol set, this is not a problem, as
entity relationship diagramming techniques are the same regardless of the symbols being used.

Entity Relationship Diagram - Entity Types & Occurrence
Similar entity occurrences are grouped together and collectively termed an entity type. It is entity types that are identified and drawn on the data model.

An entity occurrence identifies a specific resource, event, location, notion or (more typically) physical object.

In this course the term 'entity' is, by default, referring to entity type. The term entity occurrence will be specifically used where that is relevant.

Each entity has a data group associated with it. The elements of the data group are referred to as the 'attributes ' of the entity. The distinction between what is an attribute of an entity and what is an entity in its own right is often unclear. This is illustrated shortly.

Entity Relationship Diagram - Entity Naming
Entity names are normally single words and the name chosen should be one familiar to the users. The entity name can include a qualifier in order to clarify their meaning. However, if different names are currently used to describe a given entity in different areas of the organization then a new one should be chosen that is original, unique and meaningful to all of the users.

For example, the terms 'signed contract', 'sale' and 'agreement' might be recreated as the entity 'completed'.

Conversely an organization may be using a 'catch all' term to describe what the analyst identifies as being a number of separate entities. For example the term 'invoice' may be being used to describe 3 invoice types - each of which is, in fact, processed in a different manner.

In this case prefixing the entity names with qualifiers, is likely to be the best solution.
Entity Relationship Diagram - Entity Identification
The process of identifying entities is one of the most important steps in developing a data model.

It is common practice for an experienced analyst to adopt an intuitive approach to entity identification, in order to produce a shortlist of potential entities. The viability of each of these potential entities can then be considered using a set of entity identification guidelines. This should result in some of the potential entities being confirmed as entities, whilst others will be rejected.

In this exercise you will be asked to identify a set of potential entities within a simple business scenario. This should help you to understand and appreciate the entity identification guidelines better.

Read the following case study. Study this information carefully and see if you can identify the entities - remember that entities are those things about which data will be stored.

Make your own list of those things that you think are likely to be entities, before moving to the next screen.

Entity Relationship Diagram - Entity Identification Case Study
City Cameras is an independent retailer of cameras, video-cameras and accessories. The owner fulfils the roles of shopkeeper and manager and he purchases a variety of products from a number of different suppliers.

The owner can check on different suppliers wholesale and recommended retail prices with reference to their price lists, as shown.

During a normal day several customers will enter the shop and a number of them will buy one or more of the products on sale.

At some stage the owner may decide that one or more product lines need to be re-ordered, following a visual stock-take. He will then consult the latest suppliers price lists to see who is offering the best deals on given product lines.

Following this, he will ring one or more of the suppliers to order some of their products. At the same time he will also make a written record of the orders that have been placed with each supplier on a separate sheet of paper. These records are then used to verify incoming orders and invoicing details.
Entity Relationship Diagram - Entity Identification – Exercise#1
With reference to the case study information, make a list of all of those things mentioned in the case study that could be entities - that is the potential entities.

Your list should look something like that shown below:

Suppliers Price List, Customer, Product, Order, Invoicing Details & Supplier

There are six potential entities listed. From this initial list we will consider the 'suppliers price list' to be a likely attribute of the entity 'supplier'. Therefore we shall consider this within the context of the supplier entity. The 'invoicing details' are stated to be attributes of the 'order record' entity, so we shall also discount this as a potential entity at this stage.

Remember that entities are described in the singular as they relate to entity types. 'Customer' for example represents the entity type 'customer' which encompasses an infinite number of 'customer' entity occurrences.

Taking these four as our list of potential entities, each will be discussed in turn:

Entity Relationship Diagram - Entity Identification – Exercise#2
In many business systems, information about the customer is of great importance. An insurance company or bank, for example, could not function without a customer database on which comprehensive personal details are stored. This customer database also serves as an essential resource for selling new financial products and services.

But how much customer information is likely to be stored by City Cameras?

Are they even going to record the name & address of their customers?

Interviews with the owner reveal the answer to be that he has no real interest in storing 'information' about his customers. He only records their details onto any necessary warranty documents and then sends these off to the appropriate supplier.

Therefore, in the context of this system customer is NOT an entity.

Entity Relationship Diagram - Entity Identification – Exercise#3
It is a natural assumption that all retail businesses would hold a significant amount of product information. However in this study the only level of product information is that which is held on the suppliers' price lists.

Lets look again at the suppliers price list in the case study. This confirms that product information is held within this system and it is apparent from the case study that products are of real interest.

So have we identified an entity?

At this stage it would be likely that product would be considered to be an entity. However, you will shortly see why the analysis phase needs to be iterative - enabling decisions to be altered later, if necessary.
Entity Relationship Diagram - Entity Identification – Exercise#4

data modeling - suppliers price list
Once again a natural assumption would be that a retail business would store substantial information about its' suppliers.

On requesting to see information about City Cameras' suppliers, the owner once again reaches for the suppliers' price lists.

Lets look again at the suppliers' price list in the case study. Each of these lists has the name, address and telephone number of the supplier on the first page. The suppliers' price list is the only place where City Cameras stores information about suppliers.

Whilst the early investigation indicated that 'product' was probably an entity, it now becomes apparent that the unique identification of a product and access to the product information is also only possible after locating the relevant suppliers price list.

It has now been established that all of the information that is stored in relation to the two potential entities 'product' and 'supplier' are held in the same place - the suppliers' price list. This means that the suppliers' price list is an entity and that both product and supplier represent information held within this entity.

Both supplier and product are therefore identified as being attributes of the entity 'suppliers price list'.

Entity Relationship Diagram - Entity Identification – Exercise#5
What about the potential entity: 'Order'. Investigation reveals that the re-ordering process consists of visual stocktaking on an ad-hoc basis, followed by mental recall of those suppliers that stock the identified products.

The appropriate suppliers price lists are then referred to for the up-to-date pricing information and contact details and the order is placed over the telephone. The owner keeps a written record of the orders he places, each order on a separate sheet of paper, and these are then filed. Let's look again at the record of an order, as shown in the case study.

This written order record is used to check against incoming products, to verify invoicing details and to chase orders that may be overdue. The 'order' is held as stored information and therefore 'order' does represent an entity.

Entity Relationship Diagram - Entity Identification – Exercise#6
Having started with six potential entities (suppliers price list, customer, product, order, invoicing details and supplier), the analysis has identified that only two of these are in fact entities.

We eliminated customer, as no customer information is recorded or stored within this retail outlet.

The stored information relating to both a product and a supplier was found to only exist within the suppliers' price list. Therefore Suppliers' Price List was identified as being the only entity amongst these three.

Order was confirmed as a system entity and the invoicing details were identified early on as being an attribute of this entity.

Even in this simple scenario it should be apparent that entity identification needs careful consideration. Interestingly, both of the entities that were identified existed as documents within the system. Entities are often synonymous with discrete information stores within a system - whether physical or electronic.

The precise definition of what is an entity and what is an attribute will not always be clear. Therefore the process of entity identification should be iterative, enabling the review of decisions made earlier. Remember, entity types are always named in the singular and this name then represents all of the occurrences of that entity type.
Entity Relationship Diagram - Entity Identification Guidelines
There are a variety of methods that can be employed when trying to identify system entities. There follows a series of entity identification guidelines, which should prove helpful to the inexperienced analyst:

An informal questioning approach can be adopted, in which the analyst asks targeted questions to determine what information is necessary and whether or not that information is recorded within the system.

During face to face discussions with users the nouns (or given names of objects) should be recorded - as these often indicate those things that are entities within a system.

The existing documentation often contains clues as to the information that needs to be held and once again the nouns in the text may indicate potential entities.

Every fact that is required to support the business is almost certainly an attribute (or data item). In turn each of these attributes will belong to an entity. If no 'parent' entity can be found for one or more of these low level facts, then this indicates that your entity search is incomplete.

However, don't get hung up on the initial analysis. Entity identification can continue once the drawing of the data model diagram has begun. As this diagram is developed and refined further entities may become apparent.
Entity Relationship Diagram - Attributes
Many different occurrences of a given entity type can usually be identified. In the gift shop example both of the entities 'order' and 'suppliers price list' had numerous occurrences.

Each entity type can always be described in terms of attributes, and these attributes will apply to all occurrences of that given entity type. In the camera shop example, all occurrences of the entity 'supplier' could be described by an identifiable set of attributes, including:

The Supplier Name, the Supplier Address, Telephone Number, etcetera.

A given attribute belonging to a given entity occurrence can only have one value. Therefore, if a supplier could have more than one address or telephone number then this should be determined before defining the attributes of that entity type.

In this example the defined entity may require two or three address and/or telephone number attributes. It is the maximum practical instances of a given attribute that should be catered for in the entity type definition.
Entity Relationship Diagram - Entity Keys
An entity is defined by its attributes. Furthermore, each entity occurrence can be uniquely identified, by using an attribute or a combination of attributes as a key.

The primary key is the attribute (or group of attributes) that serve to uniquely identify each entity occurrence. Consider the problem that might arise if the name and address of an individual were used as the primary key for identifying the patients within a hospital.

Take the example of a patient called David Smith living at 23 Acacia Avenue. He has a son also called David Smith living at the same address.

Name and Address would not necessarily provide a unique identifier and confusion could easily arise, potentially creating a mix up with the patient records.

For this reason, in a hospital system patients each have a Patient Number as their primary key.

If two or more data items are used as the unique identifier, then this represents a compound key. For example, a compound key used to identify a book could be 'Title' together with 'Author'.

There may be occasions of authors using a previously used title but not of an author using the same title for more than one of their own books.

Where several possible primary keys exist they are called candidate keys.

For example, a book could be identified, either by 'Title' together with 'Author' or by the widely used unique identifier for books - the ISBN number.

Where an attribute of one entity is a candidate key for another entity, it is termed a foreign key.

For example, the attribute 'Author' belonging to the entity Book is a foreign key within the entity Author. You may be able to think of some shortcomings to the use of this attribute as the primary key, for example two authors having the same name.

It is worth noting that entity relationships are often indicated by the presence of foreign keys.

Entity Relationship Diagram - Relationships

data modeling - example

The relationship is the association between two entities to which all of the occurrences of those entities must conform. The diagram shown represents the beginnings of a data model where the relationship between a manager and a department needs to be defined.

The entities on data models are linked by relationship lines and together these are the only two components that make up a data model diagram. A relationship is an association between two entities to which all of the occurrences of those entities must conform.

Every relationship line shows two reciprocal relationships:
That of the first entity with respect to the second and that of the second entity with respect to the first. In this example a manager is responsible for a department and a department is the responsibility of a manager.

Each relationship line has three distinct properties: Firstly the relationship link phrase, secondly the degree or cardinality of the relationship and thirdly the participation or optionality of the relationship. These three properties combine to form the relationship statement.

Entity Relationship Diagram - Relationship Link Phrase

data modeling - example

The first property of the relationship statement is the relationship link phrase. This should be a short description of the nature of the relationship, typically between three and five words long. It is always read clockwise with respect to the entities that it links, so in this example: 'Manager is responsible for department', and 'Department is responsibility of manager'.

If the same relationship were to be drawn with department on the left hand side then the positions of the link phrases would have to be reversed.

Entity Relationship Diagram - Relationship Cardinality
The second property of the relationship statement is the degree, or maximum cardinality, of the relationship. If an entity has a crowsfoot symbol drawn against it, then many occurrences of that entity may relate to the other entity. Conversely if no crowsfoot is drawn against it, at most one occurrence of that entity may relate to the other entity.

data modeling - example
In this example: Each company employs one or more employees, but Each employee is employed by only one company. This is called a one-to-many relationship. Maximum cardinalities may be combined to give another two relationship types, In this example:
data modeling - example
Each manager is responsible for only one department and each department is the responsibility of only one manager. This is called a one-to-one relationship.

And in this example:
data modeling - example
Each lecturer teaches one or more courses and each course is taught by one or more lecturers. This is called a many-to-many relationship.

To recap, three different relationship types have been illustrated, one-to-many, one-to-one and many-to-many.

Entity Relationship Diagram - Relationship Participation
The third and final property of the relationship statement is the participation or optionality. A solid line shows that an entity occurrence must be associated with each occurrence of the other entity. In this example:
data modeling - example
Each passenger must possess a ticket, and Each ticket must belong to a passenger. A dotted line shows that an entity occurrence may be associated with each occurrence of the other entity, In this example:

data modeling - example
Each book may be borrowed by a borrower, and Each borrower may borrow one or more books. Furthermore, these symbols can be combined. In this example:
data modeling -example
Each book may be recalled by a reservation, but Each reservation must be recalling a book.

Remember, there are only two components to a data model diagram, entities and relationships. A relationship is an association between two entities to which all of the occurrences of those two entities must conform.

There are three distinct properties of the relationship; firstly the relationship link phrase, secondly the degree or cardinality of the relationship and thirdly the participation or optionality of the relationship. These three properties are collectively termed the relationship statement.

Entity Relationship Diagram - Identifying Relationships
There are just two questions that need to be asked, in order to establish the degree of the relationship that exists between any two entities.

In order to identify the degree of the relationship between the entities X and Y the following two questions need to be asked.

Question 1
Can an occurrence of X to be associated with more than one occurrence of Y?

Question 2
Can an occurrence of Y to be associated with more than one occurrence of X?

Each of these questions can be answered 'Yes' or 'No' and both questions must be answered. This means that there are four possible outcomes as shown in the table.
data modeling - decision grid

The nature of the relationship associated with each outcome is as follows:

Option 1, Question1 equals Yes, Question2 equals No.
In this case a one-to-many relationship has been identified, represented by the relationship line shown.

Option 2, Question1 equals No, Question2 equals Yes
As in the first case a one-to-many relationship has been identified, represented by the relationship line shown.

Option 3, Question1 equals Yes, Question2 equals Yes
In this case a many-to-many relationship has been identified.

Many-to-many relationships may be shown in the early 'preliminary' data model in order to aid the clarity of these early diagrams. However, such relationships are invalid and are therefore always re-modeled using 'link entities' in later diagrams. This process is explained later in the course.

Option 4, Question1 equals No, Question2 equals No
In this case a one-to-one link has been identified.

Legitimate one-to-one relationships are rare and it is likely that this relationship is one that needs to be rationalized. The methods used to investigate one-to-one relationships and to re-model them where necessary are explained later in the course.

In a one-to-many relationship the entity at the 'one' end is normally referred to as the master, and the entity at the 'many' end referred to as the detail entity. Some analysts adopt the 'no dead crows' rule and avoid drawing crowsfeet pointing upwards. This ensures that detail entities are shown below the master entities to which they belong.

This makes the diagram clearer, although congestion may make this rule difficult to enforce throughout the data model.

Entity Relationship Diagram - Relationship Statements
The relationship statement is a formal description that encompasses the three properties of the relationship.

The relationship statement encompasses the three properties of the relationship. The first property is the relationship link phrase, the second the degree or cardinality of the relationship and the third the participation or optionality of the relationship.

No comments:

Post a Comment