The Information Technology Blog brought to you by 4 Ace Technologies

Monday, March 23, 2009

Training Session - PHP Developer

We welcome you to 4 Ace Technologies and hope you will learn, grow, and give quality output on our projects as an individual/team member at our organization.

To get you started and make you productive, we require you to go through the following material as thoroughly as possible, as it will be very handy while you work on projects. Also, refer to our blog at http://4acetech.blogspot.com from time to time to keep up with the recommended reading that we prefer our employees to study.

Our first training session consists of the following sections that you'll go through:
1) SVN
2) PHP
3) AJAX
4) SQL

If you need any help or have any queries, you may contact your supervisor or team leader.

SVN:
PHP:
AJAX:
SQL:


4 Ace Technologies

AJAX Tutorial (from w3schools)

You can refer to the following link to learn AJAX basics:

http://www.w3schools.com/Ajax/Default.Asp


PHP Tutorial (from w3schools)

Please go through all the PHP sections (especially the database and AJAX sections) of the w3schools tutorial, which can be reached at the following link:

http://www.w3schools.com/PHP/

You have to




SQL Introduction (from w3schools)

You are required to go through the SQL Basic and SQL Advanced tutorial from w3schools, which can be reached at the following URL:

http://www.w3schools.com/sql/sql_intro.asp



MySql Optimization / Tuning (Case Study)

Extracted from:
http://dev.mysql.com/tech-resources/articles/mysql-db-design-ch18.pdf

This is recommended reading from the link above. If the link doesn't work, refer to the text below:

Case Study: High-Hat Delivers!

To augment his already-stellar decision-making skills, High-Hat Airways’ CEO employs a
diverse squadron of soothsayers, astrologists, and fortune tellers. For months, these experts
have been pushing him to enter the lucrative package shipment marketplace. The CEO has
been reluctant; he can’t exactly explain why, but he’s been waiting for a signal. Finally, the
sign arrives in the form of a dream: rows of singing cardboard boxes, each stuffed with cash.
The next morning, the CEO summons his executive team to an emergency meeting to
deliver the great news: High-Hat is entering the shipping business!
With lightning speed, High-Hat’s IT department springs into action. Impending layoffs are
delayed. Vacations are canceled; several projects nearing completion are put on hold as the
entire team works around the clock to realize the CEO’s dream.
This Herculean effort pays off: An entire package tracking application infrastructure is built
and deployed worldwide in a matter of weeks. Of course, QA is a bit behind; performance
testing isn’t even a consideration.
Initially, the new package tracking service is a hit. Wall Street raises earnings estimates, and
many executives receive bonuses. The previously delayed IT layoffs now proceed, adding
even more to the bottom line.
From a systems perspective, everything appears fine. The new applications have relatively
few issues. Results are accurate, and response time is reasonable. During the course of the
first month, however, things begin to change. Mysterious reports, detailing sporadic yet horrendous
application performance problems, start arriving daily from High-Hat’s far-flung
empire. Rumors of severe problems leak out to financial analysts, who promptly cut earnings
estimates, thereby decimating the stock price. Several executives are demoted, while numerous
midlevel managers are moved from their offices into tiny cubicles.
A desperate CIO calls you late one night. High-Hat is very sorry about laying you off, and
wants you to return to help overcome these problems. After renegotiating your compensation
package, you’re ready to go back to work.
Being an astute performance-tuning expert, you know that the first task is to accurately catalog
the problems. Only then can you proceed with corrections. Your initial analysis separates
the main problems into the following high-level categories.
Problem Queries
After looking into the query situation, you realize that, basically, two types of problem
queries exist. The first is encountered when a user tries to look up the status of a shipment.
However, it is not consistent: It appears to happen sporadically for most users. The second
problem query happens to everyone whenever they attempt to accept a new package for
shipment.
Package Status Lookup
Internal employees and external website users have begun complaining that it takes too long
to look up the shipping status for a package. What makes this more perplexing is that it
doesn’t happen all the time. Some queries run very fast, whereas others can take minutes to
complete.
The principal tables for tracking package status include the following:
CREATE TABLE package_header (
package_id INTEGER PRIMARY KEY AUTO_INCREMENT,
dropoff_location_id SMALLINT(3),
destination_location_id SMALLINT(3),
sender_first_name VARCHAR(20),
sender_last_name VARCHAR(30),
...
recipient_first_name VARCHAR(20),
recipient_last_name VARCHAR(30),
...
recipient_fax VARCHAR(30),
...
INDEX (sender_last_name, sender_first_name),
INDEX (recipient_last_name, recipient_first_name),
INDEX (recipient_fax)
) ENGINE = INNODB;
CREATE TABLE package_status (
package_status_id INTEGER PRIMARY KEY AUTO_INCREMENT,
package_id INTEGER NOT NULL REFERENCES package_header(package_id),
...
...
package_location_id SMALLINT(3) NOT NULL,
activity_timestamp DATETIME NOT NULL,
comments TEXT,
INDEX (package_id)
) ENGINE = INNODB;
Diagnosis
As an experienced MySQL expert, you know that MySQL offers a number of valuable tools
to help spot performance problems. One of them is the slow query log, as discussed in
Chapter 2, “Performance Monitoring Options.” By simply enabling this log, you can sit
back and wait for the troubled queries to make their presence known.
Sure enough, after a few minutes you see some candidates:
# Time: 060306 17:26:18
# User@Host: [fpembleton] @ localhost []
# Query_time: 6 Lock_time: 0 Rows_sent: 12 Rows_examined: 573992012
SELECT ph.*, ps.* FROM package_header ph, package_status ps WHERE
ph.package_id = ps.package_id AND ph.recipient_fax like '%431-5979%';
# Time: 060306 17:26:19
# User@Host: [wburroughs] @ localhost []
# Query_time: 9 Lock_time: 0 Rows_sent: 0 Rows_examined: 5739922331
SELECT ph.*, ps.* FROM package_header ph, package_status ps WHERE
ph.package_id = ps.package_id AND ph.recipient_fax like '%785-4551%';
# Time: 060306 17:26:21
# User@Host: [nikkis] @ localhost []
# Query_time: 9 Lock_time: 0 Rows_sent: 0 Rows_examined: 5739922366
SELECT ph.*, ps.* FROM package_header ph, package_status ps WHERE
ph.package_id = ps.package_id AND ph.recipient_fax like '%341-1142%';
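When the slow query log grows beyond a handful of entries, a small script can rank them for you. The following is a minimal sketch, assuming the log uses exactly the header layout shown in the excerpt above (integer timings, one `# Query_time:` line per entry). The function names `parse_slow_log` and `worst_offenders` are made up for this illustration, and other MySQL versions may write the header differently.

```python
import re

# Two entries copied from the excerpt above (with plain ASCII quotes).
SAMPLE_LOG = """\
# Time: 060306 17:26:18
# User@Host: [fpembleton] @ localhost []
# Query_time: 6 Lock_time: 0 Rows_sent: 12 Rows_examined: 573992012
SELECT ph.*, ps.* FROM package_header ph, package_status ps WHERE
ph.package_id = ps.package_id AND ph.recipient_fax like '%431-5979%';
# Time: 060306 17:26:19
# User@Host: [wburroughs] @ localhost []
# Query_time: 9 Lock_time: 0 Rows_sent: 0 Rows_examined: 5739922331
SELECT ph.*, ps.* FROM package_header ph, package_status ps WHERE
ph.package_id = ps.package_id AND ph.recipient_fax like '%785-4551%';
"""

# Assumption: statistics appear on one line, as integers, in this order.
STATS = re.compile(
    r"# Query_time: (\d+)\s+Lock_time: (\d+)\s+Rows_sent: (\d+)\s+Rows_examined: (\d+)"
)


def parse_slow_log(text):
    """Return one dict per log entry: timings, row counts, and the SQL text."""
    entries = []
    current = None
    for line in text.splitlines():
        match = STATS.match(line)
        if match:
            query_time, lock_time, rows_sent, rows_examined = map(int, match.groups())
            current = {
                "query_time": query_time,
                "lock_time": lock_time,
                "rows_sent": rows_sent,
                "rows_examined": rows_examined,
                "sql": "",
            }
            entries.append(current)
        elif line.startswith("#"):
            current = None  # another header line: stop collecting SQL text
        elif current is not None:
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return entries


def worst_offenders(entries, limit=5):
    """Sort by rows examined, which is a strong hint of a table scan."""
    return sorted(entries, key=lambda e: e["rows_examined"], reverse=True)[:limit]


entries = parse_slow_log(SAMPLE_LOG)
for entry in worst_offenders(entries):
    print(entry["rows_examined"], entry["rows_sent"], entry["sql"][:60])
```

A large gap between rows examined and rows sent, as in these entries, is usually the first thing worth investigating.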
Now that you’ve found what appears to be a problem query, your next step is to run EXPLAIN
to see what steps the MySQL optimizer is following to obtain results:
mysql> EXPLAIN
-> SELECT ph.*, ps.*
-> FROM package_header ph, package_status ps
-> WHERE ph.package_id = ps.package_id
-> AND ph.recipient_fax like '%431-5979%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ph
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 521750321
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: ps
type: ref
possible_keys: package_id
key: package_id
key_len: 4
ref: high_hat.ph.package_id
rows: 1
Extra:
2 rows in set (0.00 sec)
This output provides the answer: MySQL is performing an expensive table scan on
package_header every time a user searches on recipient fax. Considering the sheer size of
the table, it’s apparent that this leads to very lengthy queries. It also explains the sporadic
nature of the query problem: Most status queries use some other lookup criteria.
When you interview the developer of the query, you learn that this query exists to serve customers,
who might not always know the area code for the recipient fax. To make the query
more convenient, the developer allowed users to just provide a phone number, and he places
a wildcard before and after the number to find all possible matches. He’s aghast to learn that
this type of query frequently renders existing indexes useless.
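You can reproduce the effect of a leading wildcard without a MySQL server. The sketch below uses SQLite (through Python's built-in sqlite3 module) as a stand-in, so the plan text differs from MySQL's EXPLAIN output and the optimizers are not identical. It compares only the wildcard search against an exact match. The cut-down table, the index name `idx_recipient_fax`, and the helper `plan` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE package_header (
        package_id    INTEGER PRIMARY KEY,
        recipient_fax VARCHAR(30),
        comments      TEXT
    );
    CREATE INDEX idx_recipient_fax ON package_header (recipient_fax);
""")


def plan(sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Wildcards on both sides: the index on recipient_fax cannot narrow the search.
wildcard = plan(
    "SELECT * FROM package_header WHERE recipient_fax LIKE ?", ("%431-5979%",)
)
# Exact value: the engine can seek straight to the matching index entries.
exact = plan(
    "SELECT * FROM package_header WHERE recipient_fax = ?", ("516-431-5979",)
)

print(wildcard)
print(exact)
```

In the printed plans, the first query should be reported as a scan of the table and the second as a search using the index, which mirrors the `type: ALL` versus indexed access difference in the MySQL output.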
Solution
When faced with a large-table query that is not correctly taking advantage of indexes, you
have two very different options: Fix the query or add a new index. In this case, it’s probably
easiest and wisest to just correct the query. The application logic should force the user to
enter an area code and fax number. In combination, these two values will be able to employ
the index:
mysql> EXPLAIN
-> SELECT ph.*, ps.*
-> FROM package_header ph, package_status ps
-> WHERE ph.package_id = ps.package_id
-> AND ph.recipient_fax like '516-431-5979'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ph
type: range
possible_keys: PRIMARY,recipient_fax
key: recipient_fax
key_len: 30
ref: NULL
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: ps
type: ref
possible_keys: package_id
key: package_id
key_len: 4
ref: high_hat.ph.package_id
rows: 1
Extra:
2 rows in set (0.00 sec)
As you saw earlier during the review of MySQL’s optimizer in Chapter 6, “Understanding
the MySQL Optimizer,” version 5.0 offers better index utilization. In this case, the developer
might elect to allow the user to query on several area codes. The new optimizer capabilities
mean that MySQL can still take advantage of the index:
mysql> EXPLAIN
-> SELECT ph.*, ps.*
-> FROM package_header ph, package_status ps
-> WHERE ph.package_id = ps.package_id
-> AND ((ph.recipient_fax like '516-431-5979')
-> OR (ph.recipient_fax like '212-431-5979'))\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ph
type: range
possible_keys: PRIMARY,recipient_fax
key: recipient_fax
key_len: 30
ref: NULL
rows: 2
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: ps
type: ref
possible_keys: package_id
key: package_id
key_len: 4
ref: high_hat.ph.package_id
rows: 1
Extra:
2 rows in set (0.00 sec)
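The application-side part of this fix, forcing the user to supply an area code before the query is built, could look roughly like the sketch below. It is only an illustration: the helper name `build_fax_lookup` is hypothetical, the NNN-NNN-NNNN format is assumed from the sample values above, and the `%s` placeholder is the style used by common Python MySQL drivers rather than anything prescribed by the case study.

```python
import re


def build_fax_lookup(area_code, number):
    """Validate user input and return (sql, params) for an exact-match lookup.

    Hypothetical helper: rejects searches without an area code so that the
    query never needs a leading wildcard.
    """
    area_digits = re.sub(r"\D", "", area_code or "")
    number_digits = re.sub(r"\D", "", number or "")
    if len(area_digits) != 3:
        raise ValueError("A three-digit area code is required")
    if len(number_digits) != 7:
        raise ValueError("A seven-digit fax number is required")

    fax = f"{area_digits}-{number_digits[:3]}-{number_digits[3:]}"
    sql = (
        "SELECT ph.*, ps.* "
        "FROM package_header ph, package_status ps "
        "WHERE ph.package_id = ps.package_id "
        "AND ph.recipient_fax = %s"
    )
    # The value travels as a bound parameter, never pasted into the SQL text.
    return sql, (fax,)


sql, params = build_fax_lookup("516", "431-5979")
print(params)
```

Rejecting incomplete input in the application keeps the slow, wildcard-on-both-sides form of the query from ever reaching the database.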
Shipping Option Lookup
To wring more profit from its shipping service, High-Hat implemented a complex pricing
mechanism, with thousands of possible prices based on weight, distance, potential value of
the customer, currency, language, and so on. All of this information is stored in a single, vital
lookup table:
CREATE TABLE shipping_prices (
price_id INTEGER PRIMARY KEY AUTO_INCREMENT,
price_code CHAR(17) NOT NULL,
from_zone SMALLINT(3) NOT NULL,
to_zone SMALLINT(3) NOT NULL,
min_weight DECIMAL(6,2) NOT NULL,
max_weight DECIMAL(6,2) NOT NULL,
...
price_in_usd decimal(5,2) NOT NULL,
price_in_euro decimal(5,2) NOT NULL,
price_in_gbp decimal(5,2) NOT NULL,
...
price_in_zambia_kwacha DECIMAL(15,2) NOT NULL,
price_rules_in_english LONGTEXT NOT NULL,
price_rules_in_spanish LONGTEXT NOT NULL,
...
price_rules_in_tagalog LONGTEXT NOT NULL,
price_rules_in_turkish LONGTEXT NOT NULL,
...
INDEX (price_code),
INDEX (from_zone),
INDEX (to_zone),
INDEX (min_weight),
INDEX (max_weight)
) ENGINE = MYISAM;
Users are complaining that it takes too long to look up the potential price to ship a package.
In several cases, customers have either hung up on the High-Hat sales representative or
even stormed out of the package drop-off centers.
Diagnosis
Given how frequently this data is accessed by users, it seems that it should be resident in
memory most of the time. However, this is not what your analysis shows.
The first thing you check is the size of the table and its indexes. You’re surprised to see that
this table has hundreds of thousands of very large rows, which consume enormous amounts
of space and makes full memory-based caching unlikely.
The next observation that you make is that this is a heavily denormalized table. This means
that when a High-Hat representative retrieves the necessary rows to quote a price to a
customer in France, each row that she accesses contains vastly larger amounts of information
(such as the price in all currencies and shipping rules in all languages), even though this data
is irrelevant in her circumstance.
Finally, you examine the query cache to see how many queries and results are being buffered
in memory. You’re disappointed to see that the query cache hit rate is very low. However,
this makes sense: Recall that if two queries differ in any way, they cannot leverage the query
cache.
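To see why variable queries defeat an exact-match cache, consider this toy model. It is not MySQL's implementation, only a dictionary keyed by the full query text. The class `ExactTextCache`, the function `fake_run`, and the sample queries are invented for the illustration.

```python
class ExactTextCache:
    """Toy result cache keyed by the exact query text (illustration only)."""

    def __init__(self):
        self._results = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, sql, run_query):
        if sql in self._results:
            self.hits += 1
        else:
            self.misses += 1
            self._results[sql] = run_query(sql)
        return self._results[sql]


def fake_run(sql):
    # Stand-in for a real database round trip.
    return f"rows for: {sql}"


cache = ExactTextCache()
queries = [
    "SELECT price_in_euro FROM shipping_prices WHERE from_zone = 12 AND to_zone = 40",
    # Different literal: a different key, so a miss.
    "SELECT price_in_euro FROM shipping_prices WHERE from_zone = 12 AND to_zone = 41",
    # Identical text to the first query: the only hit.
    "SELECT price_in_euro FROM shipping_prices WHERE from_zone = 12 AND to_zone = 40",
    # Same meaning, different letter case: still a different key.
    "select price_in_euro from shipping_prices where from_zone = 12 and to_zone = 40",
]
for q in queries:
    cache.fetch(q, fake_run)

print(cache.hits, cache.misses)
```

With thousands of zone, weight, and currency combinations, most lookups resemble the second and fourth queries here, so they rarely find a cached result.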
Solution
The underlying problem here is that the database design is horribly inefficient: Had the
designers done a better job of normalization, there would be reduced memory requirements
for the essential lookup columns; extraneous columns would not even be included in most
result sets. Alas, a database redesign is out of the question, so your next course of action is to
make the best of a bad situation and help MySQL do a better job of caching information
given the dreadful database design.
Often, the least aggravating and time-consuming approach to raising cache performance is
to simply plug more memory into your database server. However, in this case, the server has
no more storage capacity, so you need to come up with an alternative strategy. The only
remaining choice is to focus on MySQL configuration.
You have several choices when deciding how to cache heavily accessed tables containing critical
lookup information that is infrequently updated.
• Switch to a MEMORY table: These fast, RAM-based tables were explored in the Chapter 4,
“Designing for Speed,” overview of MySQL’s storage engines. If there were sufficient
RAM, you could theoretically load the entire shipping_prices table into memory.
However, there isn’t enough storage, so this option is not workable.
• Increase utilization of the key cache: As you saw in Chapter 11, “MyISAM
Performance Enhancement,” the MyISAM key cache leverages memory to hold index
values, thereby reducing costly disk access. However, memory is already a precious
commodity on this server, so it’s unlikely that you’ll be able to extract some additional
RAM from your administrators. In addition, this isn’t an index problem; instead, the
fault lies with the sheer amount of data in each row.
• Make better use of the query cache: As you have seen, the query cache buffers frequently
used queries and result sets. However, there are several important requirements
before a query can extract results from this buffer. One crucial prerequisite is that a new
query must exactly match already-cached queries and result sets. If the new query does
not match, the query cache will not be consulted to return results.
In this case, you know from your analysis that there is a high degree of variability
among queries and result sets, which means that even the largest query cache won’t
help.
• Employ replication: Recall from the earlier replication discussion that
significant performance benefits often accrue by simply spreading the processing load
among multiple machines. In this case, placing this fundamental lookup table on its
own dedicated machines is very wise. Because there are no other tables with which to
contend, it will have the lion’s share of memory, so caching hit rates should be somewhat
improved.
The application will then be pointed at this server to perform lookups, which should
require minimal code changes. However, given the mammoth amount of data found in
this table, it’s vital that the replicated server have sufficient memory and processor
speed to effectively serve its clients. Last but not least, these large rows have the potential
to crowd your network, so it’s important that the replicated server be placed on a
fast network with ample bandwidth. If not, there’s a real risk that you might trade one
performance problem for another.
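One way to keep the code changes small when pointing lookups at the dedicated server is to route connections through a single function instead of scattering host names across the application. The sketch below is an assumption about how that might be organized: the host names and the helper `host_for` are hypothetical.

```python
# Hypothetical host names; the real ones depend on the deployment.
PRIMARY_HOST = "db-primary.example.internal"
LOOKUP_REPLICA_HOST = "db-prices.example.internal"

# Tables served by the dedicated replica. Only reads are sent there.
REPLICATED_LOOKUP_TABLES = {"shipping_prices"}


def host_for(table, is_write=False):
    """Pick the server for a statement that touches a single table."""
    if not is_write and table in REPLICATED_LOOKUP_TABLES:
        return LOOKUP_REPLICA_HOST
    # Writes must go to the primary so that replication can carry them over.
    return PRIMARY_HOST


print(host_for("shipping_prices"))
print(host_for("shipping_prices", is_write=True))
print(host_for("package_status"))
```

Because every lookup asks this one function where to connect, a later change of server touches one place rather than every query in the code base.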
Random Transaction Bottlenecks
You’ve saved the most difficult-to-pin-down performance obstacle for last. The IT help desk
receives high volumes of support calls at sporadic times throughout the day. During these
periods of intense activity, users complain of system response problems across the board.
These delays affect everything from saving new transaction records to updating existing
package status details to running reports. To make matters worse, there doesn’t appear to be
a correlation with user activity load: Some of the most severe slowdowns happen during off-peak
hours.
Diagnosis
Fortunately, when faced with such a fuzzy, hard-to-define problem, you have a wide variety
of tools at your disposal. These range from operating system monitors to network traffic
indicators to MySQL utilities. In circumstances in which there doesn’t appear to be a consistent
problem, it’s often best to arrive at a diagnosis by the process of elimination. You can
work through a list of possible causes of the transient performance issue:
• Insufficient hardware: If your server is underpowered, it’s likely that this deficiency is
most prominent during periods of peak activity. That isn’t the case here. To be certain,
it is wise to turn on server load tracking and then correlate that with MySQL response
issues.
• Network congestion: This is a little harder to rule out, but the performance problems
are not always happening during busy hours. Still, a slight possibility exists that
some user or process is hogging the network at seemingly random times, which incorrectly
gives the appearance of a MySQL problem. Matching a saved trace of network
activity with reports of performance problems goes a long way toward completely eliminating
this as a possible cause.
• Poor database design: Performance problems are the norm, rather than the exception,
if the database designers made key strategic errors when laying out the schema.
The same holds true for indexes: Generally, an inefficient index strategy is easy to identify
and correct.
• Badly designed queries: Given the broad constituency that is complaining about
these sporadic slowdowns, it seems unlikely that a single protracted query, or even a
group of sluggish queries, could be the culprit. The slow query log goes a long way
toward definitively ruling this out.
• Unplanned user data access: The widespread availability of user-driven data access
tools has brought untold joys into the lives of many IT professionals. Nothing can drag
a database server down like poorly constructed, Cartesian product-generating unrestricted
queries written by untrained users.
Aside from examining the username or IP address of the offending client, it’s difficult to
quickly identify these types of query tools within the slow query log or active user list.
However, by asking around, you learn that a number of marketing analysts have been
given business intelligence software and unrestricted access to the production database
server.
Solution
Now that you’ve established that decision support users are likely the root cause of these
issues, you look at several alternatives at your disposal to reduce the impact of this class of
user. Think of the choices as the “Three R” strategy: replication, rollup, and resource restriction.
The following list looks at these choices in descending order of convenience.
• Replication: This is, by far, the most convenient solution to your problem. Dedicating
one or more slave servers is a great way to satisfy these hard-to-please users. There will
be some initial setup and testing, but no code or server settings need to be changed,
significantly minimizing the workload for developers and administrators.
• Rollup: Another way to diminish the negative performance impact of open-ended
reports is to aggregate and summarize information into rollup tables. This approach
requires no application or configuration changes, but it does necessitate some effort on
your part, and might not completely solve the resource contention issues. Moreover,
reporting users will be faced with a lag between live data and their view of this information.
• Resource restriction: MySQL offers a variety of mechanisms to constrain user
resource consumption. These options were discussed in the Chapter 10, “General Server
Performance and Parameters Tuning,” exploration of general engine configuration.
They include max_queries_per_hour, max_updates_per_hour, max_connections_per_hour,
max_join_size, SQL_BIG_SELECTS, and max_tmp_tables.
This is the least desirable approach. First, it likely requires configuring these settings at
either a global or session level. Second, there is a significant possibility that these
changes will prove ineffective in your particular environment.
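As an illustration of the rollup option, the sketch below summarizes package_status activity into one row per day and location. SQLite (via Python's sqlite3 module) stands in for MySQL so that the example runs anywhere. The rollup table `package_status_daily`, its columns, the sample rows, and the function `refresh_rollup` are all assumptions made for the example, not part of the High-Hat schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE package_status (
        package_status_id   INTEGER PRIMARY KEY,
        package_id          INTEGER NOT NULL,
        package_location_id INTEGER NOT NULL,
        activity_timestamp  TEXT NOT NULL
    );
    -- Hypothetical rollup table: one row per day and location.
    CREATE TABLE package_status_daily (
        activity_date       TEXT NOT NULL,
        package_location_id INTEGER NOT NULL,
        status_events       INTEGER NOT NULL,
        distinct_packages   INTEGER NOT NULL,
        PRIMARY KEY (activity_date, package_location_id)
    );
""")

# Made-up sample activity.
conn.executemany(
    "INSERT INTO package_status (package_id, package_location_id, activity_timestamp) "
    "VALUES (?, ?, ?)",
    [
        (1, 10, "2006-03-06 09:00:00"),
        (1, 11, "2006-03-06 15:30:00"),
        (2, 10, "2006-03-06 10:15:00"),
        (2, 10, "2006-03-06 18:45:00"),
        (3, 10, "2006-03-07 08:05:00"),
    ],
)


def refresh_rollup(connection):
    """Rebuild the summary from scratch; intended for a scheduled, off-peak run."""
    with connection:  # commit on success, roll back on error
        connection.execute("DELETE FROM package_status_daily")
        connection.execute("""
            INSERT INTO package_status_daily
            SELECT date(activity_timestamp), package_location_id,
                   COUNT(*), COUNT(DISTINCT package_id)
            FROM package_status
            GROUP BY date(activity_timestamp), package_location_id
        """)


refresh_rollup(conn)
rows = conn.execute(
    "SELECT * FROM package_status_daily ORDER BY activity_date, package_location_id"
).fetchall()
for row in rows:
    print(row)
```

Reporting users then query the small summary table instead of the live one, at the cost of the lag between refreshes mentioned above.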
Implementing These Solutions
Now that your first paycheck from High-Hat has cleared and you’ve decided how to address
these problems, what’s the safest course of action to implement your solutions? You
reviewed this topic in general in the Chapter 1, “Setting Up an Optimization Test
Environment,” exploration of setting up a performance optimization environment.
In the context of the problems identified in this chapter, it’s wise to execute changes in the
following order:
1. Package status query correction—Correcting the application to force entry of an
area code when looking up packages by recipient fax number correctly employs the relevant
index and eliminates the costly table scans currently plaguing users. This is a low-risk,
high-reward alteration.
2. Roll up tables for business intelligence query users—These users are wrecking performance
at unpredictable intervals. Because you don’t have the authority to lock them
out of the system, it’s a good idea to aggregate data for them in rollup tables. This is
lower risk, and requires less work than the next step, replication.
3. Replicate information for business intelligence query users—This is the ideal solution
for the problem of end-user query writers, but it does require some work on your
part (and introduces some minimal risk of “collateral damage”) to implement.
4. Replicate the shipping_prices table to a dedicated server—This change goes a long
way toward reducing resource contention on the primary server. Just like its counterpart
for business intelligence users, this type of activity comes with its own setup costs
and risks. In this case, you must change application logic to point at the right server,
which entails work as well as establishes some hazards if you miss a reference in your
software.


Entity-relationship model (from Wikipedia)

Extracted from: http://en.wikipedia.org/wiki/Entity-relationship_model

Entity-relationship model

From Wikipedia, the free encyclopedia

A sample ER diagram

An Entity-Relationship Model (ERM) in software engineering is an abstract and conceptual representation of data. Entity-relationship modeling is a relational schema database modeling method, used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion.

Diagrams created using this process are called entity-relationship diagrams, or ER diagrams or ERDs for short. The definitive reference for entity relationship modelling is generally given as Peter Chen's 1976 paper[1].

However, variants of the idea existed previously (see for example A.P.G. Brown[2]) and have been devised subsequently.

Overview

The first stage of information system design uses these models during the requirements analysis to describe information needs or the type of information that is to be stored in a database. The data modeling technique can be used to describe any ontology (i.e. an overview and classifications of used terms and their relationships) for a certain universe of discourse (i.e. area of interest). In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Note that sometimes, both of these phases are referred to as "physical design".

There are a number of conventions for entity-relationship diagrams (ERDs). The classical notation is described in the remainder of this article, and mainly relates to conceptual modeling. There is a range of notations more typically employed in logical and physical database design, such as IDEF1X.

Connection

Two related entities
An entity with an attribute
A relationship with an attribute

An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world.[3]

An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. Although the term entity is the one most commonly used, following Chen we should really distinguish between an entity and an entity-type. An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for this term.

Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical theorem. Entities are represented as rectangles.

A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs, linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship.

Entities and relationships can both have attributes. Examples: an employee entity might have a Social Security Number (SSN) attribute; the proved relationship may have a date attribute. Attributes are represented as ellipses connected to their owning entity sets by a line.

Every entity (unless it is a weak entity) must have a minimal set of uniquely identifying attributes, which is called the entity's primary key.

Entity-relationship diagrams don't show single entities or single instances of relations. Rather, they show entity sets and relationship sets. Example: a particular song is an entity. The collection of all songs in a database is an entity set. The eaten relationship between a child and her lunch is a single relationship. The set of all such child-lunch relationships in a database is a relationship set.

Lines are drawn between entity sets and the relationship sets they are involved in. If all entities in an entity set must participate in the relationship set, a thick or double line is drawn. This is called a participation constraint. If each entity of the entity set can participate in at most one relationship in the relationship set, an arrow is drawn from the entity set to the relationship set. This is called a key constraint. To indicate that each entity in the entity set is involved in exactly one relationship, a thick arrow is drawn.
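To connect these ideas to an actual database, here is one possible mapping of the mathematician, theorem, and proved example to tables. It is a minimal sketch rather than the only valid translation: the table and column names and the sample rows are invented, and SQLite (via Python's sqlite3 module) is used so the example runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
    -- Entity sets become tables; each has a primary key.
    CREATE TABLE mathematician (
        mathematician_id INTEGER PRIMARY KEY,
        name             TEXT NOT NULL
    );
    CREATE TABLE theorem (
        theorem_id INTEGER PRIMARY KEY,
        title      TEXT NOT NULL
    );
    -- The "proved" relationship set, carrying its own date attribute.
    CREATE TABLE proved (
        mathematician_id INTEGER NOT NULL REFERENCES mathematician (mathematician_id),
        theorem_id       INTEGER NOT NULL REFERENCES theorem (theorem_id),
        proved_on        TEXT,
        PRIMARY KEY (mathematician_id, theorem_id)
    );
""")

# Made-up sample rows: one entity of each kind and one relationship instance.
conn.execute("INSERT INTO mathematician VALUES (1, 'Example Mathematician')")
conn.execute("INSERT INTO theorem VALUES (1, 'Example Theorem')")
conn.execute("INSERT INTO proved VALUES (1, 1, '2009-03-23')")

rows = conn.execute("""
    SELECT m.name, t.title, p.proved_on
    FROM proved p
    JOIN mathematician m ON m.mathematician_id = p.mathematician_id
    JOIN theorem t ON t.theorem_id = p.theorem_id
""").fetchall()
print(rows)

# A relationship instance must link existing entities.
try:
    conn.execute("INSERT INTO proved VALUES (1, 99, NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

Each row of `proved` is a single relationship, and the table as a whole is the relationship set, matching the distinction drawn above.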

Alternative diagramming conventions

Two related entities shown using Crow's Foot notation

Chen's notation for entity-relationship modelling uses rectangles to represent entities, and diamonds to represent relationships. This notation is appropriate because Chen's relationships are first-class objects: they can have attributes and relationships of their own.

Alternative conventions, some of them mainly of historical interest, are:

Crow's Foot

One alternative notation, known as "crow's foot" notation, was developed independently: in these diagrams, entities are represented by boxes, and relationships by labelled arcs.

The "Crow's Foot" notation represents relationships with connecting lines between entities, and pairs of symbols at the ends of those lines to represent the cardinality of the relationship. Crow's Foot notation is used in Barker's Notation and in methodologies such as SSADM and Information Engineering.

For a while Chen's notation was more popular in the United States, while Crow's Foot notation was used primarily in the UK, being used in the 1980s by the then-influential consultancy practice CACI. Many of the consultants at CACI (including Barker) subsequently moved to Oracle UK, where they developed the early versions of Oracle's CASE tools; this had the effect of introducing the notation to a wider audience, and it is now used in many tools including System Architect, Visio, PowerDesigner, Toad Data Modeler, DeZign for Databases, OmniGraffle, MySQL Workbench and Dia. Crow's foot notation has the following benefits:

  • Clarity in identifying the many, or child, side of the relationship, using the crow's foot.
  • Concise notation for identifying mandatory relationship, using a perpendicular bar, or an optional relation, using an open circle.
  • Shows a clear and concise notation that identifies all classes

ER diagramming tools

There are many ER diagramming tools. Some of the proprietary ER diagramming tools are Avolution, ConceptDraw, ER/Studio, ERwin, DeZign for Databases, MEGA International, OmniGraffle, Oracle Designer, PowerDesigner, Rational Rose, SmartDraw, Sparx Enterprise Architect, SQLyog, Toad Data Modeler, Microsoft Visio, and Visual Paradigm. A freeware ER tool that can generate database and application layer code (webservices) is the RISE Editor.

Some free software ER diagramming tools that can interpret and generate ER models and SQL, and perform database analysis, are StarUML, MySQL Workbench, and SchemaSpy[5]. Some free software diagram tools, such as Kivio and Dia, can't create ER diagrams as such; they just draw the shapes without any knowledge of what they mean and without generating SQL. Dia diagrams can, however, be translated with tedia2sql.




Entity Relation Diagram - Tutorial (Getahead)

Extracted from: http://www.getahead-direct.com/gwentrel.htm

The information in this free Entity Relationship Diagrams tutorial is taken from "GetAhead - Entity Relationship Diagrams".
It explains the entire process of drawing entity relationship diagrams, including worked examples and exercises.


Entity Relationship Diagram - Introduction
[Figure: example data model]
Entity relationship diagramming is a technique that is widely used in the world of business and information technology to show how information is, or should be, stored and used within a business system.

The success of any organization relies on the efficient flow and processing of information.

In this example information flows around the various departments within the organization. This information can take many forms, for example it could be written, oral or electronic.

Here is an example of the sort of information flows that you might be analyzing:

The general manager regularly communicates with staff in the sales and marketing and accounts departments by using e-mail. Orders received by sales and marketing are forwarded to the production and accounts departments for fulfillment and invoicing. The accounts department forwards regular written reports to the general manager; it also raises invoices and sends these to the customers.

Data modeling is a technique aimed at optimizing the way that information is stored and used within an organization. It begins with the identification of the main data groups, for example the invoice, and continues by defining the detailed content of each of these groups. This results in structured definitions for all of the information that is stored and used within a given system.

The technique provides a solid foundation for systems design and a universal standard for system documentation. Data modeling is an essential precursor to analysis & design, maintenance & documentation and improving the performance of an existing system.


Entity Relationship Diagram - Diagram Notation
Entity relationship diagramming uses a standard set of symbols to represent each of these defined data groups and then proceeds by establishing the relationships between them. The first of these symbols is the soft-box entity symbol.
[Figure: entity soft box]
An entity is something about which data will be stored within the system under consideration. In this example the data group invoice can be identified as a system entity.

The other main component on a data model is the relationship line. A Relationship is an association between two entities to which all of the occurrences of those entities must conform.

[Figure: example relationship]
The relationship is represented by a line that joins the two entities to which it refers. This line represents two reciprocal relationships: that of the first entity with respect to the second, and that of the second entity with respect to the first.

Entity relationship diagramming is all about identifying entities and their relationships and then drawing a diagram that accurately depicts the system. This applies equally to the design of a new system or the analysis of an existing one.

The end result of entity relationship diagramming should be a clear picture of how information is stored and related within a proposed, or existing, system.

Entity Relationship Diagram - Entities
Here, we illustrate the concept of an entity, which can be applied to almost anything that is significant to the system being studied. Some examples of information systems and their entities are listed below:

Banking system: Customer, Account, Loan.

Airline system: Aircraft, Passenger, Flight, Airport.

An entity is represented by a box containing the name of that entity.

A precise definition of ‘entity’ is not really possible, as entities vary in nature. For example, in the airline system, whilst an aircraft is a physical object (entities often are), a flight is an event and an airport is a location. However, entities are nearly always those things about which data will be stored within the system under investigation.

Note that entities are always named in the singular; for example: customer, account and loan, and not customers, accounts and loans.

This course uses symbols that are standard in the IT industry, including the soft-box symbol shown to represent an entity. If a site uses a different symbol set, this is not a problem, as entity relationship diagramming techniques are the same regardless of the symbols being used.

Entity Relationship Diagram - Entity Types & Occurrence
Similar entity occurrences are grouped together and collectively termed an entity type. It is entity types that are identified and drawn on the data model.

An entity occurrence identifies a specific resource, event, location, notion or (more typically) physical object.

In this course the term 'entity' is, by default, referring to entity type. The term entity occurrence will be specifically used where that is relevant.

Each entity has a data group associated with it. The elements of the data group are referred to as the 'attributes' of the entity. The distinction between what is an attribute of an entity and what is an entity in its own right is often unclear. This is illustrated shortly.
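
As a loose analogy in PHP (an illustration only, not part of the GetAhead course), an entity type can be thought of as a class, its attributes as properties, and each entity occurrence as one object of that class. The class name Aircraft, its two attributes and the sample values below are hypothetical.

```php
<?php
// Entity type: identified once on the data model and named in the singular.
class Aircraft
{
    public $Registration; // attribute (hypothetical)
    public $Model;        // attribute (hypothetical)

    public function __construct($registration, $model)
    {
        $this->Registration = $registration;
        $this->Model = $model;
    }
}

// Entity occurrences: individual aircraft, all described by the same attributes.
$occurrences = array(
    new Aircraft('G-ABCD', 'Boeing 737'),
    new Aircraft('G-WXYZ', 'Airbus A320'),
);

echo count($occurrences), " occurrences of entity type ", get_class($occurrences[0]), "\n";
```

Only the class (the entity type) would appear on the data model; the individual objects would not.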

Entity Relationship Diagram - Entity Naming
Entity names are normally single words and the name chosen should be one familiar to the users. The entity name can include a qualifier in order to clarify its meaning. However, if different names are currently used to describe a given entity in different areas of the organization, then a new one should be chosen that is original, unique and meaningful to all of the users.

For example, the terms 'signed contract', 'sale' and 'agreement' might be recreated as the entity 'completed'.

Conversely, an organization may be using a 'catch all' term to describe what the analyst identifies as being a number of separate entities. For example, the term 'invoice' may be used to describe 3 invoice types - each of which is, in fact, processed in a different manner.

In this case, prefixing the entity names with qualifiers is likely to be the best solution.

Entity Relationship Diagram - Entity Identification
The process of identifying entities is one of the most important steps in developing a data model.

It is common practice for an experienced analyst to adopt an intuitive approach to entity identification, in order to produce a shortlist of potential entities. The viability of each of these potential entities can then be considered using a set of entity identification guidelines. This should result in some of the potential entities being confirmed as entities, whilst others will be rejected.

In this exercise you will be asked to identify a set of potential entities within a simple business scenario. This should help you to understand and appreciate the entity identification guidelines better.

Read the following case study. Study this information carefully and see if you can identify the entities - remember that entities are those things about which data will be stored.

Make your own list of those things that you think are likely to be entities, before moving to the next screen.

Entity Relationship Diagram - Entity Identification Case Study
City Cameras is an independent retailer of cameras, video-cameras and accessories. The owner fulfils the roles of shopkeeper and manager and he purchases a variety of products from a number of different suppliers.

The owner can check on different suppliers' wholesale and recommended retail prices with reference to their price lists, as shown.

During a normal day several customers will enter the shop and a number of them will buy one or more of the products on sale.

At some stage the owner may decide that one or more product lines need to be re-ordered, following a visual stock-take. He will then consult the latest suppliers' price lists to see who is offering the best deals on given product lines.

Following this, he will ring one or more of the suppliers to order some of their products. At the same time he will also make a written record of the orders that have been placed with each supplier on a separate sheet of paper. These records are then used to verify incoming orders and invoicing details.
Entity Relationship Diagram - Entity Identification – Exercise#1
With reference to the case study information, make a list of all of those things mentioned in the case study that could be entities - that is the potential entities.

Your list should look something like that shown below:

Suppliers' Price List, Customer, Product, Order, Invoicing Details & Supplier

There are six potential entities listed. From this initial list we will consider the 'suppliers price list' to be a likely attribute of the entity 'supplier'. Therefore we shall consider this within the context of the supplier entity. The 'invoicing details' are stated to be attributes of the 'order record' entity, so we shall also discount this as a potential entity at this stage.

Remember that entities are described in the singular as they relate to entity types. 'Customer' for example represents the entity type 'customer' which encompasses an infinite number of 'customer' entity occurrences.

Taking these four as our list of potential entities, each will be discussed in turn:

Entity Relationship Diagram - Entity Identification – Exercise#2
In many business systems, information about the customer is of great importance. An insurance company or bank, for example, could not function without a customer database on which comprehensive personal details are stored. This customer database also serves as an essential resource for selling new financial products and services.

But how much customer information is likely to be stored by City Cameras?

Are they even going to record the name & address of their customers?

Interviews with the owner reveal the answer to be that he has no real interest in storing 'information' about his customers. He only records their details onto any necessary warranty documents and then sends these off to the appropriate supplier.

Therefore, in the context of this system customer is NOT an entity.

Entity Relationship Diagram - Entity Identification – Exercise#3
It is a natural assumption that all retail businesses would hold a significant amount of product information. However in this study the only level of product information is that which is held on the suppliers' price lists.

Let's look again at the suppliers' price list in the case study. This confirms that product information is held within this system and it is apparent from the case study that products are of real interest.

So have we identified an entity?

At this stage it would be likely that product would be considered to be an entity. However, you will shortly see why the analysis phase needs to be iterative - enabling decisions to be altered later, if necessary.
Entity Relationship Diagram - Entity Identification – Exercise#4

[Figure: suppliers' price list]
Once again a natural assumption would be that a retail business would store substantial information about its suppliers.

On requesting to see information about City Cameras' suppliers, the owner once again reaches for the suppliers' price lists.

Let's look again at the suppliers' price list in the case study. Each of these lists has the name, address and telephone number of the supplier on the first page. The suppliers' price list is the only place where City Cameras stores information about suppliers.

Whilst the early investigation indicated that 'product' was probably an entity, it now becomes apparent that the unique identification of a product and access to the product information is also only possible after locating the relevant suppliers' price list.

It has now been established that all of the information that is stored in relation to the two potential entities 'product' and 'supplier' is held in the same place - the suppliers' price list. This means that the suppliers' price list is an entity and that both product and supplier represent information held within this entity.

Both supplier and product are therefore identified as being attributes of the entity 'suppliers price list'.

Entity Relationship Diagram - Entity Identification – Exercise#5
What about the potential entity 'order'? Investigation reveals that the re-ordering process consists of visual stocktaking on an ad-hoc basis, followed by mental recall of those suppliers that stock the identified products.

The appropriate suppliers' price lists are then referred to for the up-to-date pricing information and contact details and the order is placed over the telephone. The owner keeps a written record of the orders he places, each order on a separate sheet of paper, and these are then filed. Let's look again at the record of an order, as shown in the case study.

This written order record is used to check against incoming products, to verify invoicing details and to chase orders that may be overdue. The 'order' is held as stored information and therefore 'order' does represent an entity.

Entity Relationship Diagram - Entity Identification – Exercise#6
Having started with six potential entities (suppliers' price list, customer, product, order, invoicing details and supplier), the analysis has identified that only two of these are in fact entities.

We eliminated customer, as no customer information is recorded or stored within this retail outlet.

The stored information relating to both a product and a supplier was found to only exist within the suppliers' price list. Therefore Suppliers' Price List was identified as being the only entity amongst these three.

Order was confirmed as a system entity and the invoicing details were identified early on as being an attribute of this entity.

Even in this simple scenario it should be apparent that entity identification needs careful consideration. Interestingly, both of the entities that were identified existed as documents within the system. Entities are often synonymous with discrete information stores within a system - whether physical or electronic.

The precise definition of what is an entity and what is an attribute will not always be clear. Therefore the process of entity identification should be iterative, enabling the review of decisions made earlier. Remember, entity types are always named in the singular and this name then represents all of the occurrences of that entity type.
Entity Relationship Diagram - Entity Identification Guidelines
There are a variety of methods that can be employed when trying to identify system entities. There follows a series of entity identification guidelines, which should prove helpful to the inexperienced analyst:

An informal questioning approach can be adopted, in which the analyst asks targeted questions to determine what information is necessary and whether or not that information is recorded within the system.

During face to face discussions with users the nouns (or given names of objects) should be recorded - as these often indicate those things that are entities within a system.

The existing documentation often contains clues as to the information that needs to be held and once again the nouns in the text may indicate potential entities.

Every fact that is required to support the business is almost certainly an attribute (or data item). In turn each of these attributes will belong to an entity. If no 'parent' entity can be found for one or more of these low level facts, then this indicates that your entity search is incomplete.

However, don't get hung up on the initial analysis. Entity identification can continue once the drawing of the data model diagram has begun. As this diagram is developed and refined further entities may become apparent.
Entity Relationship Diagram - Attributes
Many different occurrences of a given entity type can usually be identified. In the camera shop example both of the entities 'order' and 'suppliers price list' had numerous occurrences.

Each entity type can always be described in terms of attributes, and these attributes will apply to all occurrences of that given entity type. In the camera shop example, all occurrences of the entity 'supplier' could be described by an identifiable set of attributes, including:

The Supplier Name, the Supplier Address, Telephone Number, etcetera.

A given attribute belonging to a given entity occurrence can only have one value. Therefore, if a supplier could have more than one address or telephone number then this should be determined before defining the attributes of that entity type.

In this example the defined entity may require two or three address and/or telephone number attributes. It is the maximum practical instances of a given attribute that should be catered for in the entity type definition.
Entity Relationship Diagram - Entity Keys
An entity is defined by its attributes. Furthermore, each entity occurrence can be uniquely identified, by using an attribute or a combination of attributes as a key.

The primary key is the attribute (or group of attributes) that serves to uniquely identify each entity occurrence. Consider the problem that might arise if the name and address of an individual were used as the primary key for identifying the patients within a hospital.

Take the example of a patient called David Smith living at 23 Acacia Avenue. He has a son also called David Smith living at the same address.

Name and Address would not necessarily provide a unique identifier and confusion could easily arise, potentially creating a mix up with the patient records.

For this reason, in a hospital system patients each have a Patient Number as their primary key.

If two or more data items are used as the unique identifier, then this represents a compound key. For example, a compound key used to identify a book could be 'Title' together with 'Author'.

There may be occasions of authors using a previously used title but not of an author using the same title for more than one of their own books.

Where several possible primary keys exist they are called candidate keys.

For example, a book could be identified, either by 'Title' together with 'Author' or by the widely used unique identifier for books - the ISBN number.

Where an attribute of one entity is a candidate key for another entity, it is termed a foreign key.

For example, the attribute 'Author' belonging to the entity Book is a foreign key that refers to the entity Author. You may be able to think of some shortcomings of using this attribute as the primary key of Author, for example two authors having the same name.

It is worth noting that entity relationships are often indicated by the presence of foreign keys.
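
The hospital example above can be sketched in PHP as follows. This is only an illustration under stated assumptions: the patient numbers, the array layout and the helper function IndexByKey are hypothetical and do not come from the tutorial. The sketch indexes the same two patient occurrences first by the compound key Name plus Address and then by Patient Number.

```php
<?php
// Two patient occurrences sharing a name and an address.
// The patient numbers are hypothetical.
$patients = array(
    array('PatientNumber' => 1001, 'Name' => 'David Smith', 'Address' => '23 Acacia Avenue'),
    array('PatientNumber' => 1002, 'Name' => 'David Smith', 'Address' => '23 Acacia Avenue'),
);

// Index the occurrences by a key built from the given attribute names.
function IndexByKey(array $occurrences, array $keyAttributes)
{
    $index = array();
    foreach ($occurrences as $occurrence) {
        $parts = array();
        foreach ($keyAttributes as $attribute) {
            $parts[] = $occurrence[$attribute];
        }
        // Occurrences with the same key value overwrite each other.
        $index[implode('|', $parts)] = $occurrence;
    }
    return $index;
}

$byNameAndAddress = IndexByKey($patients, array('Name', 'Address'));
$byPatientNumber  = IndexByKey($patients, array('PatientNumber'));

echo count($byNameAndAddress), "\n"; // 1: father and son collide on the compound key
echo count($byPatientNumber), "\n";  // 2: the patient number identifies each occurrence
```

The compound key loses one of the two records, which is the mix-up described above; the patient number keeps both.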

Entity Relationship Diagram - Relationships

[Figure: data modeling example]

The relationship is the association between two entities to which all of the occurrences of those entities must conform. The diagram shown represents the beginnings of a data model where the relationship between a manager and a department needs to be defined.

The entities on data models are linked by relationship lines and together these are the only two components that make up a data model diagram. A relationship is an association between two entities to which all of the occurrences of those entities must conform.

Every relationship line shows two reciprocal relationships: that of the first entity with respect to the second and that of the second entity with respect to the first. In this example a manager is responsible for a department and a department is the responsibility of a manager.

Each relationship line has three distinct properties: Firstly the relationship link phrase, secondly the degree or cardinality of the relationship and thirdly the participation or optionality of the relationship. These three properties combine to form the relationship statement.

Entity Relationship Diagram - Relationship Link Phrase

[Figure: data modeling example]

The first property of the relationship statement is the relationship link phrase. This should be a short description of the nature of the relationship, typically between three and five words long. It is always read clockwise with respect to the entities that it links, so in this example: 'Manager is responsible for department', and 'Department is responsibility of manager'.

If the same relationship were to be drawn with department on the left hand side then the positions of the link phrases would have to be reversed.

Entity Relationship Diagram - Relationship Cardinality
The second property of the relationship statement is the degree, or maximum cardinality, of the relationship. If an entity has a crowsfoot symbol drawn against it, then many occurrences of that entity may relate to the other entity. Conversely if no crowsfoot is drawn against it, at most one occurrence of that entity may relate to the other entity.

[Figure: data modeling example]
In this example, each company employs one or more employees, but each employee is employed by only one company. This is called a one-to-many relationship.

Maximum cardinalities may be combined to give another two relationship types. In this example:
[Figure: data modeling example]
Each manager is responsible for only one department and each department is the responsibility of only one manager. This is called a one-to-one relationship.

And in this example:
[Figure: data modeling example]
Each lecturer teaches one or more courses and each course is taught by one or more lecturers. This is called a many-to-many relationship.

To recap, three different relationship types have been illustrated, one-to-many, one-to-one and many-to-many.

Entity Relationship Diagram - Relationship Participation
The third and final property of the relationship statement is the participation or optionality. A solid line shows that an entity occurrence must be associated with each occurrence of the other entity. In this example:
[Figure: data modeling example]
Each passenger must possess a ticket, and each ticket must belong to a passenger.

A dotted line shows that an entity occurrence may be associated with each occurrence of the other entity. In this example:
[Figure: data modeling example]
Each book may be borrowed by a borrower, and each borrower may borrow one or more books.

Furthermore, these symbols can be combined. In this example:
[Figure: data modeling example]
Each book may be recalled by a reservation, but each reservation must be recalling a book.

Remember, there are only two components to a data model diagram, entities and relationships. A relationship is an association between two entities to which all of the occurrences of those two entities must conform.

There are three distinct properties of the relationship; firstly the relationship link phrase, secondly the degree or cardinality of the relationship and thirdly the participation or optionality of the relationship. These three properties are collectively termed the relationship statement.

Entity Relationship Diagram - Identifying Relationships
Just two questions need to be asked in order to establish the degree of the relationship that exists between any two entities X and Y.

Question 1
Can an occurrence of X be associated with more than one occurrence of Y?

Question 2
Can an occurrence of Y be associated with more than one occurrence of X?

Each of these questions can be answered 'Yes' or 'No' and both questions must be answered. This means that there are four possible outcomes as shown in the table.
[Figure: decision grid]

The nature of the relationship associated with each outcome is as follows:

Option 1: Question 1 equals Yes, Question 2 equals No.
In this case a one-to-many relationship has been identified, represented by the relationship line shown.

Option 2: Question 1 equals No, Question 2 equals Yes.
As in the first case a one-to-many relationship has been identified, represented by the relationship line shown.

Option 3: Question 1 equals Yes, Question 2 equals Yes.
In this case a many-to-many relationship has been identified.

Many-to-many relationships may be shown in the early 'preliminary' data model in order to aid the clarity of these early diagrams. However, such relationships are invalid and are therefore always re-modeled using 'link entities' in later diagrams. This process is explained later in the course.

Option 4: Question 1 equals No, Question 2 equals No.
In this case a one-to-one link has been identified.

Legitimate one-to-one relationships are rare and it is likely that this relationship is one that needs to be rationalized. The methods used to investigate one-to-one relationships and to re-model them where necessary are explained later in the course.
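
The four outcomes above can be summarized as a small PHP function. This is a minimal sketch, not part of the course: the function name RelationshipDegree and the wording of the returned strings are assumptions made for illustration.

```php
<?php
// Classify the degree of the relationship between entities X and Y
// from the answers to Question 1 and Question 2.
function RelationshipDegree($xRelatesToManyY, $yRelatesToManyX)
{
    if ($xRelatesToManyY && $yRelatesToManyX) {
        return 'many-to-many';                   // Option 3
    }
    if ($xRelatesToManyY) {
        return 'one-to-many (one X to many Y)';  // Option 1
    }
    if ($yRelatesToManyX) {
        return 'one-to-many (one Y to many X)';  // Option 2
    }
    return 'one-to-one';                         // Option 4
}

// X = company, Y = employee: a company can employ more than one employee,
// but an employee is employed by only one company.
echo RelationshipDegree(true, false), "\n";

// X = lecturer, Y = course: both answers are Yes.
echo RelationshipDegree(true, true), "\n";
```

Both questions are always answered before classifying, which is why the function takes two arguments rather than stopping at the first Yes.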

In a one-to-many relationship the entity at the 'one' end is normally referred to as the master, and the entity at the 'many' end referred to as the detail entity. Some analysts adopt the 'no dead crows' rule and avoid drawing crowsfeet pointing upwards. This ensures that detail entities are shown below the master entities to which they belong.

This makes the diagram clearer, although congestion may make this rule difficult to enforce throughout the data model.

Entity Relationship Diagram - Relationship Statements
The relationship statement is a formal description that encompasses the three properties of the relationship. The first property is the relationship link phrase, the second is the degree or cardinality of the relationship, and the third is the participation or optionality of the relationship.
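
As a rough illustration, the three properties can be combined into one sentence in PHP. The function name RelationshipStatement, its parameter order and the exact sentence template are assumptions for this sketch; the course may word its statements differently.

```php
<?php
// Build one direction of a relationship statement from its three properties:
// link phrase, degree (cardinality) and participation (optionality).
function RelationshipStatement($from, $linkPhrase, $isMandatory, $isMany, $to)
{
    $participation = $isMandatory ? 'must' : 'may';
    $degree = $isMany ? 'one or more' : 'only one';
    return "Each $from $participation $linkPhrase $degree $to";
}

// Taken from the borrower example earlier in the tutorial.
echo RelationshipStatement('borrower', 'borrow', false, true, 'books'), "\n";
// Each borrower may borrow one or more books

// Hypothetical combination: the degree of this relationship is assumed here.
echo RelationshipStatement('reservation', 'be recalling', true, false, 'book'), "\n";
// Each reservation must be recalling only one book
```

Since every relationship line represents two reciprocal relationships, a complete description would call the function once for each direction.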

PHP Session Management

For an overview of PHP session management, refer to the following link:

http://php.net/session


PHP - Object Oriented Programming (OOP) (PHP5)

Read the basics of Object Oriented Programming in PHP5 at the following link:

http://www.php.net/manual/en/language.oop5.php


PHP - Getting Started

Get a general overview of PHP at the following site:

http://www.php.net/manual/en/getting-started.php


PHP Language Reference

Learn PHP from the following link:

http://www.php.net/manual/en/langref.php

It is basically the language reference, which will get you used to PHP syntax, commonly used methods, and what you can do with PHP in general.


PHP and Object Oriented Programming (OOP) Basics

Please read Object Oriented Basics from the following site:

http://php.net/oop


PHP Coding Standard

EXTRACTED FROM:
http://www.dagbladet.no/development/phpcodingstandard/ -- RECOMMENDED

The above link is the suggested reading; in case it does not work, read the following article.


Contents

* Introduction
    o Standardization is Important
    o Interpretation
    o Accepting an Idea
* Names
    o Make Names Fit
    o No All Upper Case Abbreviations
    o Class Names
    o Class Library Names
    o Method Names
    o Class Attribute Names
    o Method Argument Names
    o Variable Names
    o Array Element
    o Reference Variables and Functions Returning References
    o Global Variables
    o Define Names and Global Constants
    o Static Variables
    o Function Names
* Documentation
    o Comments on Comments
    o Comments Should Tell a Story
    o Document Decisions
    o Use Headers
    o Make Gotchas Explicit
    o Interface and Implementation Documentation
    o Directory Documentation
* Complexity Management
    o Layering
    o Open/Closed Principle
* Classes
    o Different Accessor Styles
    o Do Not do Real Work in Object Constructors
    o Thin vs. Fat Class Interfaces
    o Short Methods
* Process
    o Code Reviews
    o Create a Source Code Control System Early and Not Often
    o Create a Bug Tracking System Early and Not Often
    o Honor Responsibilities
* Formatting
    o Brace {} Policy
    o Indentation/Tabs/Space Policy
    o Parens () with Key Words and Functions Policy
    o If Then Else Formatting
    o switch Formatting
    o Use of continue,break and ?:
    o One Statement Per Line
    o Alignment of Declaration Blocks
* Server configuration
    o HTTP_*_VARS
    o PHP File Extensions
* Miscellaneous
    o PHP Code Tags
    o No Magic Numbers
    o Error Return Check Policy
    o Do Not Default If Test to Non-Zero
    o The Bull of Boolean Types
    o Usually Avoid Embedded Assignments
    o Reusing Your Hard Work and the Hard Work of Others
    o Use if (0) to Comment Out Code Blocks

Introduction
Standardization is Important
It helps if the standard annoys everyone in some way so everyone feels they are on the same playing field. The proposal here has evolved over many projects, many companies, and literally a total of many weeks spent arguing. It is no particular person's style and is certainly open to local amendments.
Good Points
When a project tries to adhere to common standards a few good things happen:

* programmers can go into any code and figure out what's going on
* new people can get up to speed quickly
* people new to PHP are spared the need to develop a personal style and defend it to the death
* people new to PHP are spared making the same mistakes over and over again
* people make fewer mistakes in consistent environments
* programmers have a common enemy :-)

Bad Points
Now the bad:

* the standard is usually stupid because it was made by someone who doesn't understand PHP
* the standard is usually stupid because it's not what I do
* standards reduce creativity
* standards are unnecessary as long as people are consistent
* standards enforce too much structure
* people ignore standards anyway

Discussion
The experience of many projects leads to the conclusion that using coding standards makes the project go smoother. Are standards necessary for success? Of course not. But they help, and we need all the help we can get! Be honest, most arguments against a particular standard come from the ego. Few decisions in a reasonable standard really can be said to be technically deficient, just matters of taste. So be flexible, control the ego a bit, and remember any project is fundamentally a team effort.

Interpretation
Conventions
The use of the word "shall" in this document requires that any project using this document must comply with the stated standard.

The use of the word "should" directs projects in tailoring a project-specific standard, in that the project must include, exclude, or tailor the requirement, as appropriate.

The use of the word "may" is similar to "should", in that it designates optional requirements.

Standards Enforcement
First, any serious concerns about the standard should be brought up and worked out within the group. Maybe the standard is not quite appropriate for your situation. It may have overlooked important issues or maybe someone in power vehemently disagrees with certain issues :-)

In any case, once finalized hopefully people will play the adult and understand that this standard is reasonable, and has been found reasonable by many other programmers, and therefore is worthy of being followed even with personal reservations.

Failing willing cooperation it can be made a requirement that this standard must be followed to pass a code inspection.

Failing that the only solution is a massive tickling party on the offending party.

Accepting an Idea

1. It's impossible.
2. Maybe it's possible, but it's weak and uninteresting.
3. It is true and I told you so.
4. I thought of it first.
5. How could it be otherwise.

If you come to objects with a negative preconception please keep an open mind. You may still conclude objects are bunk, but there's a road you must follow to accept something different. Allow yourself to travel it for a while.
Names
Make Names Fit
Names are the heart of programming. In the past people believed knowing someone's true name gave them magical power over that person. If you can think up the true name for something, you give yourself and the people coming after power over the code. Don't laugh!

A name is the result of a long deep thought process about the ecology it lives in. Only a programmer who understands the system as a whole can create a name that "fits" with the system. If the name is appropriate everything fits together naturally, relationships are clear, meaning is derivable, and reasoning from common human expectations works as expected.

If you find all your names could be Thing and DoIt then you should probably revisit your design.

Class Names

* Name the class after what it is. If you can't think of what it is, that is a clue you have not thought through the design well enough.
* Compound names of over three words are a clue your design may be confusing various entities in your system. Revisit your design. Try a CRC card session to see if your objects have more responsibilities than they should.
* Avoid the temptation of bringing the name of the class a class derives from into the derived class's name. A class should stand on its own. It doesn't matter what it derives from.
* Suffixes are sometimes helpful. For example, if your system uses agents then naming something DownloadAgent conveys real information.

Method and Function Names

* Usually every method and function performs an action, so the name should make clear what it does: CheckForErrors() instead of ErrorCheck(), DumpDataToFile() instead of DataFile(). This will also make functions and data objects more distinguishable.
* Suffixes are sometimes useful:
o Max - to mean the maximum value something can have.
o Cnt - the current count of a running count variable.
o Key - key value.

For example: RetryMax to mean the maximum number of retries, RetryCnt to mean the current retry count.

* Prefixes are sometimes useful:
o Is - to ask a question about something. Whenever someone sees Is they will know it's a question.
o Get - get a value.
o Set - set a value.

For example: IsHitRetryLimit.
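
A minimal sketch of how these suffixes and prefixes read together in one class. The class RetryPolicy and the methods GetRetryCnt, SetRetryMax, and CountRetry are hypothetical names chosen for illustration; only RetryMax, RetryCnt, and IsHitRetryLimit come from the text above.

```php
<?php
// Hypothetical class illustrating the Max/Cnt suffixes and Is/Get/Set prefixes.
class RetryPolicy
{
    var $mRetryMax;      // maximum number of retries allowed
    var $mRetryCnt = 0;  // retries performed so far

    function __construct($retryMax)
    {
        $this->mRetryMax = $retryMax;
    }

    function GetRetryCnt() { return $this->mRetryCnt; }

    function SetRetryMax($retryMax) { $this->mRetryMax = $retryMax; }

    // Record one more retry attempt.
    function CountRetry() { $this->mRetryCnt++; }

    // "Is" signals a question with a true/false answer.
    function IsHitRetryLimit()
    {
        return $this->mRetryCnt >= $this->mRetryMax;
    }
}

$policy = new RetryPolicy(2);
$policy->CountRetry();
$hit_after_one = $policy->IsHitRetryLimit();  // false: 1 of 2 retries used
$policy->CountRetry();
$hit_after_two = $policy->IsHitRetryLimit();  // true: limit reached
```

A reader who sees IsHitRetryLimit() in a condition knows it asks a question, and RetryMax and RetryCnt cannot be mistaken for each other.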

No All Upper Case Abbreviations

* When you could use an all upper case abbreviation, instead use an initial upper case letter followed by all lower case letters. No matter what.

Do use: GetHtmlStatistic.
Do not use: GetHTMLStatistic.

Justification

* People seem to have very different intuitions when making names containing abbreviations. It's best to settle on one strategy so the names are absolutely predictable.

Take for example NetworkABCKey. Notice how the C from ABC and the K from Key run together. Some people don't mind this and others just hate it, so you'll find different policies in different code and never know what to call something.

Example

class FluidOz // NOT FluidOZ
class GetHtmlStatistic // NOT GetHTMLStatistic

Class Names

* Use upper case letters as word separators, lower case for the rest of a word
* First character in a name is upper case
* No underbars ('_')

Justification

* Of all the different naming strategies many people found this one the best compromise.

Example

class NameOneTwo

class Name

Class Library Names

* Now that name spaces are becoming more widely implemented, name spaces should be used to prevent class name conflicts among libraries from different vendors and groups.
* When not using name spaces, it's common to prevent class name clashes by prefixing class names with a unique string. Two characters is sufficient, but a longer length is fine.

Example
John Johnson's complete data structure library could use JJ as a prefix, so classes would be:

class JjLinkList
{
}

Method Names

* Use the same rule as for class names.

Justification

* Of all the different naming strategies many people found this one the best compromise.

Example

class NameOneTwo
{
function DoIt() {};
function HandleError() {};
}

Class Attribute Names

* Class member attribute names should be prepended with the character 'm'.
* After the 'm' use the same rules as for class names.
* 'm' always precedes other name modifiers like 'r' for reference.

Justification

* Prepending 'm' prevents any conflict with method names. Often your methods and attribute names will be similar, especially for accessors.

Example

class NameOneTwo
{
function VarAbc() {};
function ErrorNumber() {};
var $mVarAbc;
var $mErrorNumber;
var $mrName;
}

Method Argument Names

* The first character should be lower case.
* All word beginnings after the first letter should be upper case as with class names.

Justification

* You can always tell which variables were passed in as arguments.

Example

class NameOneTwo
{
function StartYourEngines(&$someEngine, &$anotherEngine) {
$this->mSomeEngine = $someEngine;
$this->mAnotherEngine = $anotherEngine;
}

var $mSomeEngine;
var $mAnotherEngine;
}

Variable Names

* use all lower case letters
* use '_' as the word separator.

Justification

* With this approach the scope of the variable is clear in the code.
* Now all variables look different and are identifiable in the code.

Example

function HandleError($errorNumber)
{
$error = new OsError;
$time_of_error = $error->GetTimeOfError();
$error_processor = $error->GetErrorProcessor();
}

Array Element

Array element names follow the same rules as a variable.

* use '_' as the word separator.
* don't use '-' as the word separator

Justification

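* If '-' is used as a word separator, the element cannot be referenced by its bare name inside a double-quoted string (the interpolated strings this document calls magic quotes). This produces a warning or, in current PHP versions, a parse error.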

Example

$myarr['foo_bar'] = 'Hello';
print "$myarr[foo_bar] world"; // will output: Hello world

$myarr['foo-bar'] = 'Hello';
print "$myarr[foo-bar] world"; // warning message

Single or Double Quotes

* Access an array's elements with single or double quotes.
* Don't use quotes within magic quotes

Justification

* Some PHP configurations will output warnings if arrays are used without quotes except when used within magic quotes

Example

$myarr['foo_bar'] = 'Hello';
$element_name = 'foo_bar';
print "$myarr[foo_bar] world"; // will output: Hello world
print "$myarr[$element_name] world"; // will output: Hello world
print "$myarr['$element_name'] world"; // parse error
print "$myarr["$element_name"] world"; // parse error

Reference Variables and Functions Returning References

* References should be prepended with 'r'.

Justification

* The difference between variable types is clarified.
* It establishes the difference between a method returning a modifiable object and the same method name returning a non-modifiable object.

Example

class Test
{
var $mrStatus;
function DoSomething(&$rStatus) {};
function &rStatus() {};
}

Global Variables

* Global variables should be prepended with a 'g'.

Justification

* It's important to know the scope of a variable.

Example

global $gLog;
global $grLog; // 'g' plus 'r' for a global reference; the global statement does not accept '&'

Define Names / Global Constants

* Global constants should be all caps with '_' separators.

Justification

* It's tradition for global constants to be named this way. You must be careful not to conflict with other predefined globals.

Example

define("A_GLOBAL_CONSTANT", "Hello world!");

Static Variables

* Static variables may be prepended with 's'.

Justification

* It's important to know the scope of a variable.

Example

function test()
{
static $msStatus = 0;
}

Function Names

* For PHP functions use the GNU C convention of all lower case letters with '_' as the word delimiter.

Justification

* It makes functions very different from any class related names.

Example

function some_bloody_function()
{
}

Error Return Check Policy

* Check every system call for an error return, unless you know you wish to ignore errors.
* Include the system error text for every system error message.
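
As a rough sketch of this policy, the helper below checks each file call and carries the system error text back to the caller. The function read_first_line() and its array(ok, result) return convention are assumptions made for illustration, not part of this standard; fopen(), fgets(), fclose(), and error_get_last() are standard PHP functions.

```php
<?php
// Hypothetical helper: read the first line of a file, checking every call.
// Returns array(TRUE, line) on success or array(FALSE, message) on failure.
function read_first_line($fileName)
{
    $handle = @fopen($fileName, 'r');
    if (FALSE === $handle)
    {
        // Include the system error text in the message.
        $last_error = error_get_last();
        $error_text = (NULL !== $last_error) ? $last_error['message'] : 'unknown error';
        return array(FALSE, "Cannot open $fileName: $error_text");
    }

    $line = fgets($handle);
    if (FALSE === $line)
    {
        fclose($handle);
        return array(FALSE, "Cannot read from $fileName (empty file or read error)");
    }

    fclose($handle);
    return array(TRUE, rtrim($line, "\r\n"));
}

list($ok, $result) = read_first_line('/no/such/dir/no_such_file.txt');
if (FALSE === $ok)
{
    error_log($result);
}
```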

Braces {} Policy
Of the three major brace placement strategies two are acceptable, with the first one listed being preferable:

* Place brace under and inline with keywords:

if ($condition)
{
    ...
}

while ($condition)
{
    ...
}

* Traditional Unix policy of placing the initial brace on the same line as the keyword and the trailing brace inline on its own line with the keyword:

if ($condition) {
    ...
}

while ($condition) {
    ...
}

Justification

* Another religious issue of great debate solved by compromise. Either form is acceptable; many people, however, find the first form more pleasant. Why that is so is the topic of many psychological studies.

There are more than psychological reasons for preferring the first style. If you use an editor (such as vi) that supports brace matching, the first is a much better style. Why? Let's say you have a large block of code and want to know where the block ends. You move to the first brace, hit a key, and the editor finds the matching brace. Example:

if ($very_long_condition && $second_very_long_condition)
{
...
}
else if (...)
{
...
}

To move from block to block you just need to use cursor down and your brace matching key. No need to move to the end of the line to match a brace then jerk back and forth.

Indentation/Tabs/Space Policy

* Indent using 4 spaces for each level.
* Do not use tabs, use spaces. Most editors can substitute spaces for tabs.
* Indent as much as needed, but no more. There are no arbitrary rules as to the maximum indenting level. If the indenting level is more than 4 or 5 levels you may think about factoring out code.

Justification

* When people use different tab settings the code becomes impossible to read or print, which is why spaces are preferable to tabs.
* Most PHP applications use 4 spaces.
* Most editors use 4 spaces by default.
* As much as people would like to limit the maximum indentation levels it never seems to work in general. We'll trust that programmers will choose wisely how deep to nest code.

Example

function func()
{
    if (something bad)
    {
        if (another thing bad)
        {
            while (more input)
            {
            }
        }
    }
}

Parens () with Key Words and Functions Policy

* Do not put parens next to keywords. Put a space between.
* Do put parens next to function names.
* Do not use parens in return statements when it's not necessary.

Justification

* Keywords are not functions. Putting parens next to keywords makes keywords and function names look alike.

Example

if (condition)
{
}

while (condition)
{
}

strcmp($s, $s1);

return 1;

Do Not do Real Work in Object Constructors
Do not do any real work in an object's constructor. Inside a constructor initialize variables only and/or do only actions that can't fail.

Create an Open() method for an object which completes construction. Open() should be called after object instantiation.
Justification

* Constructors can't return an error.

Example

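define('FAIL', -1); // illustrative failure code

class Device
{
    // PHP 8 no longer treats a method named after the class as the constructor.
    function __construct() { /* initialize variables only; nothing that can fail */ }
    function Open() { return FAIL; } // real work goes here and can report failure
}

$dev = new Device;
if (FAIL == $dev->Open()) exit(1);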

Make Functions Reentrant
Functions should not keep static variables that prevent a function from being reentrant.
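
A small sketch of the problem, assuming two hypothetical functions written only for this illustration. The static accumulator in the first one is shared by every call, including nested or recursive ones, so state from one call leaks into the next.

```php
<?php
// Not reentrant: the static accumulator survives between calls.
function join_words_static($words)
{
    static $sResult = '';
    foreach ($words as $word)
    {
        $sResult .= $word . ' ';
    }
    return trim($sResult);
}

// Reentrant: all state lives in arguments and local variables.
function join_words($words)
{
    $result = '';
    foreach ($words as $word)
    {
        $result .= $word . ' ';
    }
    return trim($result);
}

$first_static  = join_words_static(array('a', 'b'));  // "a b"
$second_static = join_words_static(array('c'));       // "a b c" (stale state)

$first  = join_words(array('a', 'b'));                // "a b"
$second = join_words(array('c'));                     // "c"
```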

If Then Else Formatting
Layout
It's up to the programmer. Different bracing styles will yield slightly different looks. One common approach is:

if (condition) // Comment
{
}
else if (condition) // Comment
{
}
else // Comment
{
}

If you have else if statements then it is usually a good idea to always have an else block for finding unhandled cases. Maybe put a log message in the else even if there is no corrective action taken.

Condition Format
Always put the constant on the left hand side of an equality/inequality comparison. For example:

if ( 6 == $errorNum ) ...

One reason is that if you leave out one of the = signs, the parser will find the error for you. A second reason is that it puts the value you are looking for right up front where you can find it instead of buried at the end of your expression. It takes a little time to get used to this format, but then it really gets useful.

switch Formatting

* Falling through a case statement into the next case statement shall be permitted as long as a comment is included.
* The default case should always be present and trigger an error if it should not be reached, yet is reached.
* If you need to create variables put all the code in a block.

Example

switch (...)
{
case 1:
...
// FALL THROUGH

case 2:
{
$v = get_week_number();
...
}
break;

default:
}
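
The skeleton above might be filled in as follows. This is only a sketch: the function status_class() and its labels are hypothetical, and throwing UnexpectedValueException is one assumed way to "trigger an error" in the default case.

```php
<?php
// Hypothetical example: classify an HTTP status code by its first digit.
function status_class($statusCode)
{
    switch (intdiv($statusCode, 100))
    {
    case 2:
        $label = 'success';
        break;

    case 4:
        // FALL THROUGH: client and server errors share the code below

    case 5:
        {
            // Variables needed only by this case live in their own block.
            $detail = (4 == intdiv($statusCode, 100)) ? 'client' : 'server';
            $label  = "error ($detail)";
        }
        break;

    default:
        // Should not be reached; fail loudly if it is.
        throw new UnexpectedValueException("Unhandled status code: $statusCode");
    }

    return $label;
}

$ok_label    = status_class(200);  // "success"
$error_label = status_class(404);  // "error (client)"
```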

Use of continue, break and ?:
Continue and Break
Continue and break are really disguised gotos so they are covered here.

Continue and break, like goto, should be used sparingly as they are magic in code. With a simple spell the reader is beamed to god knows where for some usually undocumented reason.

The two main problems with continue are:

* It may bypass the test condition
* It may bypass the increment/decrement expression

Consider the following example where both problems occur:

while (TRUE)
{
...
// A lot of code
...
if (/* some condition */) {
continue;
}
...
// A lot of code
...
if ( $i++ > STOP_VALUE) break;
}

Note: "A lot of code" is necessary in order that the problem cannot be caught easily by the programmer.

From the above example, a further rule may be given: Mixing continue with break in the same loop is a sure way to disaster.
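
One way to sidestep both problems, sketched under the assumption that the loop simply walks an array: keep the test and the increment in the for header, where no branch can bypass them, and use a plain if instead of continue. The function sum_positive() is a hypothetical example.

```php
<?php
// Hypothetical example: add up the positive values and ignore the rest.
function sum_positive($values)
{
    $total = 0;
    $count = count($values);

    // The test and the increment live in the loop header.
    for ($i = 0; $i < $count; $i++)
    {
        if (0 < $values[$i])
        {
            $total += $values[$i];
        }
    }
    return $total;
}

$total = sum_positive(array(3, -1, 4, -5));  // 7
```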

?:
The trouble is people usually try and stuff too much code in between the ? and :. Here are a couple of clarity rules to follow:

* Put the condition in parens so as to set it off from other code
* If possible, the actions for the test should be simple functions.
* Put the action for the then and else statement on a separate line unless it can be clearly put on one line.

Example

(condition) ? func1() : func2();

or

(condition)
? long statement
: another long statement;

Alignment of Declaration Blocks

* Blocks of declarations should be aligned.

Justification

* Clarity.
* Similarly blocks of initialization of variables should be tabulated.
* The ‘&’ token should be adjacent to the type, not the name.

Example

var  $mDate
var& $mrDate
var& $mrName
var  $mName

$mDate  = 0;
$mrDate = NULL;
$mrName = 0;
$mName  = NULL;

One Statement Per Line
There should be only one statement per line unless the statements are very closely related.

Short Methods

* Methods should limit themselves to a single page of code.

Justification

* The idea is that each method represents a technique for achieving a single objective.
* Most arguments of inefficiency turn out to be false in the long run.
* True, function calls are slower than inline code, but there needs to be a thought-out decision (see premature optimization).

Document Null Statements
Always document a null body for a for or while statement so that it is clear that the null body is intentional and not missing code.


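// Advance $i to the first blank entry in $lines (or to the end of the array).
for ($i = 0; $i < count($lines) && '' != $lines[$i]; $i++)
    ; // VOID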

Do Not Default If Test to Non-Zero
Do not default the test for non-zero, i.e.


if (FAIL != f())

is better than


if (f())

even though FAIL may have the value 0 which PHP considers to be false. An explicit test will help you out later when somebody decides that a failure return should be -1 instead of 0. Explicit comparison should be used even if the comparison value will never change; e.g., if (!($bufsize % strlen($str))) should be written instead as if (0 == ($bufsize % strlen($str))) to reflect the numeric (not boolean) nature of the test. A frequent trouble spot is using strcmp to test for string equality, where the result should never ever be defaulted.

The non-zero test is often defaulted for predicates and other functions or expressions which meet the following restrictions:

* Returns 0 for false, nothing else.
* Is named so that the meaning of (say) a true return is absolutely obvious. Call a predicate IsValid(), not CheckValid().
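
The strcmp trouble spot mentioned above can be sketched in a few lines. strcmp() is a standard PHP function that returns 0 when the two strings are equal; the variable names are illustrative.

```php
<?php
$expected = 'admin';
$actual   = 'admin';

// Defaulted test: strcmp() returns 0 for EQUAL strings, so this reads as
// "the strings match" but is false exactly when they do.
$defaulted_says_match = strcmp($expected, $actual) ? TRUE : FALSE;

// Explicit test: comparing against 0 states the intent.
$explicit_says_match = (0 == strcmp($expected, $actual));
```

Here $defaulted_says_match ends up FALSE and $explicit_says_match ends up TRUE, even though both lines look like equality tests at a glance.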

The Bull of Boolean Types

Do not check a boolean value for equality with 1 (TRUE, YES, etc.); instead test for inequality with 0 (FALSE, NO, etc.). Most functions are guaranteed to return 0 if false, but only non-zero if true. Thus,


if (TRUE == func()) { ...

must be written


if (FALSE != func()) { ...

Usually Avoid Embedded Assignments
There is a time and a place for embedded assignment statements. In some constructs there is no better way to accomplish the results without making the code bulkier and less readable.


// $fp is assumed to be an open file handle.
while (FALSE !== ($c = fgetc($fp)))
{
    // process the character
}

The ++ and -- operators count as assignment statements. So, for many purposes, do functions with side effects. Using embedded assignment statements to improve run-time performance is also possible. However, one should consider the tradeoff between increased speed and decreased maintainability that results when embedded assignments are used in artificial places. For example,


$a = $b + $c;
$d = $a + $r;

should not be replaced by


$d = ($a = $b + $c) + $r;

even though the latter may save one cycle. In the long run the time difference between the two will decrease as the optimizer gains maturity, while the difference in ease of maintenance will increase as the human memory of what's going on in the latter piece of code begins to fade.

Reusing Your Hard Work and the Hard Work of Others
Reuse across projects is almost impossible without a common framework in place. Objects conform to the services available to them. Different projects have different service environments making object reuse difficult.

Developing a common framework takes a lot of up front design effort. When this effort is not made, for whatever reasons, there are several techniques one can use to encourage reuse:

Don't be Afraid of Small Libraries
One common enemy of reuse is people not making libraries out of their code. A reusable class may be hiding in a program directory and will never have the thrill of being shared because the programmer won't factor the class or classes into a library.

One reason for this is because people don't like making small libraries. There's something about small libraries that doesn't feel right. Get over it. The computer doesn't care how many libraries you have.

If you have code that can be reused and can't be placed in an existing library then make a new library. Libraries don't stay small for long if people are really thinking about reuse.

If you are afraid of having to update makefiles when libraries are recomposed or added then don't include libraries in your makefiles, include the idea of services. Base level makefiles define services that are each composed of a set of libraries. Higher level makefiles specify the services they want. When the libraries for a service change only the lower level makefiles will have to change.

Keep a Repository
Most companies have no idea what code they have. And most programmers still don't communicate what they have done or ask for what currently exists. The solution is to keep a repository of what's available.

In an ideal world a programmer could go to a web page, browse or search a list of packaged libraries, and take what they need. If you can set up a system that programmers voluntarily maintain, great. If you have a librarian in charge of detecting reusability, even better.

Another approach is to automatically generate a repository from the source code. This is done by using common class, method, library, and subsystem headers that can double as man pages and repository entries.

Comments on Comments
Comments Should Tell a Story
Consider your comments a story describing the system. Expect your comments to be extracted by a robot and formed into a man page. Class comments are one part of the story, method signature comments are another part of the story, method arguments another part, and method implementation yet another part. All these parts should weave together and inform someone else at another point of time just exactly what you did and why.
Document Decisions
Comments should document decisions. At every point where you had a choice of what to do place a comment describing which choice you made and why. Archeologists will find this the most useful information.
Use Headers
Use a document extraction system like ccdoc. Other sections in this document describe how to use ccdoc to document a class and method.

These headers are structured in such a way as they can be parsed and extracted. They are not useless like normal headers. So take time to fill them out. If you do it right once no more documentation may be necessary.

Comment Layout
Each part of the project has a specific comment layout.
Make Gotchas Explicit
Explicitly comment variables changed out of the normal control flow or other code likely to break during maintenance. Embedded keywords are used to point out issues and potential problems. Consider a robot will parse your comments looking for keywords, stripping them out, and making a report so people can make a special effort where needed.

Gotcha Keywords

* :TODO: topic
Means there's more to do here, don't forget.

* :BUG: [bugid] topic
Means there's a known bug here. Explain it and optionally give a bug ID.

* :KLUDGE:
When you've done something ugly say so and explain how you would do it differently next time if you had more time.

* :TRICKY:
Tells somebody that the following code is very tricky so don't go changing it without thinking.

* :WARNING:
Beware of something.

* :PARSER:
Sometimes you need to work around a parser problem. Document it. The problem may go away eventually.

* :ATTRIBUTE: value
The general form of an attribute embedded in a comment. You can make up your own attributes and they'll be extracted.

Gotcha Formatting

* Make the gotcha keyword the first symbol in the comment.
* Comments may consist of multiple lines, but the first line should be a self-containing, meaningful summary.
* The writer's name and the date of the remark should be part of the comment. This information is in the source repository, but it can take quite a while to find out when and by whom it was added. Often gotchas stick around longer than they should. Embedding date information allows other programmers to make this decision. Embedding who information lets us know who to ask.

Example

// :TODO: tmh 960810: possible performance problem
// We should really use a hash table here but for now we'll
// use a linear search.

// :KLUDGE: tmh 960810: possible unsafe type cast
// We need a cast here to recover the derived type. It should
// probably use a virtual method or template.
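
The "robot" that parses comments for keywords could be as small as the sketch below. The function find_gotchas(), its regular expression, and the shape of the report are assumptions made for illustration; a real extraction tool would likely handle more comment styles.

```php
<?php
// Hypothetical report generator: collect gotcha comments from PHP source text.
function find_gotchas($sourceText)
{
    $gotchas = array();
    $lines   = explode("\n", $sourceText);

    foreach ($lines as $index => $line)
    {
        // Match "// :KEYWORD: summary" with the keyword first in the comment.
        if (1 == preg_match('/\/\/\s*:([A-Z]+):\s*(.*)$/', $line, $matches))
        {
            $gotchas[] = array(
                'line'    => $index + 1,
                'keyword' => $matches[1],
                'summary' => trim($matches[2]),
            );
        }
    }
    return $gotchas;
}

$sample = "// :TODO: tmh 960810: possible performance problem\n"
        . "// We should really use a hash table here but for now we'll\n"
        . "// use a linear search.\n"
        . "\$x = 1;\n"
        . "// :KLUDGE: tmh 960810: possible unsafe type cast\n";

$report = find_gotchas($sample);
foreach ($report as $gotcha)
{
    print "{$gotcha['line']}: {$gotcha['keyword']}: {$gotcha['summary']}\n";
}
```

Because the keyword is the first symbol in the comment and the first line is a self-contained summary, one pattern match per line is enough to build the report.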

See Also
See Interface and Implementation Documentation for more details on how documentation should be laid out.

Interface and Implementation Documentation
There are two main audiences for documentation:

* Class Users
* Class Implementors

With a little forethought we can extract both types of documentation directly from source code.
Class Users
Class users need class interface information which when structured correctly can be extracted directly from a header file. When filling out the header comment blocks for a class, only include information needed by programmers who use the class. Don't delve into algorithm implementation details unless the details are needed by a user of the class. Consider comments in a header file a man page in waiting.
Class Implementors
Class implementors require in-depth knowledge of how a class is implemented. This comment type is found in the source file(s) implementing a class. Don't worry about interface issues. Header comment blocks in a source file should cover algorithm issues and other design decisions. Comment blocks within a method's implementation should explain even more.

Directory Documentation
Every directory should have a README file that covers:

* the purpose of the directory and what it contains
* a one line comment on each file. A comment can usually be extracted from the NAME attribute of the file header.
* cover build and install directions
* direct people to related resources:
o directories of source
o online documentation
o paper documentation
o design documentation
* anything else that might help someone

Consider a new person coming in 6 months after every original person on a project has gone. That lone scared explorer should be able to piece together a picture of the whole project by traversing a source directory tree and reading README files, Makefiles, and source file headers.

Open/Closed Principle
The Open/Closed principle states a class must be open and closed where:

* open means a class has the ability to be extended.
* closed means a class is closed for modifications other than extension. The idea is once a class has been approved for use having gone through code reviews, unit tests, and other qualifying procedures, you don't want to change the class very much, just extend it.

The Open/Closed principle is a pitch for stability. A system is extended by adding new code not by changing already working code. Programmers often don't feel comfortable changing old code because it works! This principle just gives you an academic sounding justification for your fears :-)

In practice the Open/Closed principle simply means making good use of our old friends abstraction and polymorphism. Abstraction to factor out common processes and ideas. Inheritance to create an interface that must be adhered to by derived classes.
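
A minimal sketch of the principle, using hypothetical names (ReportFormat, CsvFormat, TabFormat, build_report) that are not part of this standard. The interface is the contract that stays closed; new behavior arrives as a new class rather than as edits to code that already works.

```php
<?php
// The fixed contract every format must adhere to.
interface ReportFormat
{
    function Render($rows);
}

// Reviewed and tested; left alone from here on.
class CsvFormat implements ReportFormat
{
    function Render($rows)
    {
        $lines = array();
        foreach ($rows as $row)
        {
            $lines[] = implode(',', $row);
        }
        return implode("\n", $lines);
    }
}

// Added later as an extension; CsvFormat and build_report() are untouched.
class TabFormat implements ReportFormat
{
    function Render($rows)
    {
        $lines = array();
        foreach ($rows as $row)
        {
            $lines[] = implode("\t", $row);
        }
        return implode("\n", $lines);
    }
}

// Works with any current or future ReportFormat.
function build_report(ReportFormat $format, $rows)
{
    return $format->Render($rows);
}

$rows       = array(array(1, 2), array(3, 4));
$csv_report = build_report(new CsvFormat, $rows);
$tab_report = build_report(new TabFormat, $rows);
```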

Server configuration
This section contains some guidelines for PHP/Apache configuration.

HTTP_*_VARS
HTTP_*_VARS are either enabled or disabled. When enabled all variables must be accessed through $HTTP_*_VARS[key]. When disabled all variables can be accessed by the key name.

* use HTTP_*_VARS when accessing variables.
* use enabled HTTP_*_VARS in PHP configuration.

Justification

* HTTP_*_VARS is available in any configuration.
* HTTP_*_VARS will not conflict with existing variables.
* Users can't change variables by passing values.

PHP File Extensions
There are many different extension variants for PHP files (.html, .php, .php3, .php4, .phtml, .inc, .class...).

* Always use the extension .php.
* Always use the extension .php for your class and function libraries.

Justification

* The use of .php makes it possible to enable caching on other files than .php.
* The use of .inc or .class can be a security problem. On most servers these extensions aren't set to be run by a parser. If these are accessed they will be displayed in clear text.

Miscellaneous
This section contains some miscellaneous do's and don'ts.

* Don't use floating-point variables where discrete values are needed. Using a float for a loop counter is a great way to shoot yourself in the foot. Always test floating-point numbers as <= or >=, never use an exact comparison (== or !=).

* Do not rely on automatic beautifiers. The main person who benefits from good program style is the programmer him/herself, and especially in the early design of handwritten algorithms or pseudo-code. Automatic beautifiers can only be applied to complete, syntactically correct programs and hence are not available when the need for attention to white space and indentation is greatest. Programmers can do a better job of making clear the complete visual layout of a function or file, with the normal attention to detail of a careful programmer (in other words, some of the visual layout is dictated by intent rather than syntax and beautifiers cannot read minds). Sloppy programmers should learn to be careful programmers instead of relying on a beautifier to make their code readable. Finally, since beautifiers are non-trivial programs that must parse the source, a sophisticated beautifier is not worth the benefits gained by such a program. Beautifiers are best for gross formatting of machine-generated code.

* Accidental omission of the second '=' of the logical compare is a problem. The following is confusing and prone to error.

if ($abool= $bbool) { ... }


Does the programmer really mean assignment here? Often yes, but usually no. The solution is to just not do it, an inverse Nike philosophy. Instead use explicit tests and avoid assignment with an implicit test. The recommended form is to do the assignment before doing the test:


$abool= $bbool;
if ($abool) { ... }
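
Returning to the floating-point advice at the top of this list, here is a short sketch of why exact comparison fails and what a tolerance test looks like. The constant FLOAT_TOLERANCE and the function is_nearly_equal() are hypothetical; the right tolerance depends on the magnitude of your data.

```php
<?php
// Hypothetical tolerance; choose one that suits your values.
define('FLOAT_TOLERANCE', 0.000001);

function is_nearly_equal($left, $right)
{
    return abs($left - $right) <= FLOAT_TOLERANCE;
}

$sum = 0.1 + 0.2;

$exact_match  = (0.3 == $sum);               // false: binary rounding error
$nearly_match = is_nearly_equal(0.3, $sum);  // true

// Use an integer loop counter and derive the float from it.
$steps = 0;
for ($i = 0; $i < 10; $i++)
{
    $value = $i * 0.1;
    $steps++;
}
```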


Use if (0) to Comment Out Code Blocks
Sometimes large blocks of code need to be commented out for testing. The easiest way to do this is with an if (0) block:

function example()
{
great looking code

if (0) {
lots of code
}

more code
}

You can't use /**/ style comments because comments can't contain comments and surely a large block of your code will contain a comment, won't it?
Different Accessor Styles
Implementing Accessors
There are two major idioms for creating accessors.
Get/Set

class X
{
function GetAge() { return $this->mAge; }
function SetAge($age) { $this->mAge = $age; }
var $mAge;
};

Get/Set is ugly. Get and Set are strewn throughout the code cluttering it up.

But one benefit is when used with messages the set method can transparently transform from native machine representations to network byte order.
Attributes as Objects

class X
{
// Return by reference so that callers can bind with =& as shown below.
function &Age() { return $this->mAge; }
function &Name() { return $this->mName; }

var $mAge;
var $mName;
}

$x = new X;

// Example 1
$age = $x->Age();
$r_age = &$x->Age(); // Reference

// Example 2
$name = $x->Name();
$r_name = &$x->Name(); // Reference

Attributes as Objects is clean from a name perspective. When possible use this approach to attribute access.
Layering
Layering is the primary technique for reducing complexity in a system. A system should be divided into layers. Layers should communicate between adjacent layers using well defined interfaces. When a layer uses a non-adjacent layer then a layering violation has occurred.

A layering violation simply means we have dependency between layers that is not controlled by a well defined interface. When one of the layers changes code could break. We don't want code to break so we want layers to work only with other adjacent layers.

Sometimes we need to jump layers for performance reasons. This is fine, but we should know we are doing it and document appropriately.

Code Reviews
If you can make a formal code review work then my hat is off to you. Code reviews can be very useful. Unfortunately they often degrade into nit picking sessions and endless arguments about silly things. They also tend to take a lot of people's time for a questionable payback.

My god he's questioning code reviews, he's not an engineer!

Not really. What is being questioned is the form of code reviews and how they fit into normally late, chaotic projects.

First, code reviews are way too late to do much of anything useful. What needs reviewing are requirements and design. This is where you will get more bang for the buck.

Get all relevant people in a room. Lock them in. Go over the class design and requirements until the former is good and the latter is being met. Having all the relevant people in the room makes this process a deep fruitful one as questions can be immediately answered and issues immediately explored. Usually only a couple of such meetings are necessary.

If the above process is done well coding will take care of itself. If you find problems in the code review the best you can usually do is a rewrite after someone has sunk a ton of time and effort into making the code "work."

You will still want to do a code review, just do it offline. Have a couple people you trust read the code in question and simply make comments to the programmer. Then the programmer and reviewers can discuss issues and work them out. Email and quick pointed discussions work well. This approach meets the goals and doesn't take the time of 6 people to do it.

Create a Source Code Control System Early and Not Often
A common build system and source code control system should be put in place as early as possible in a project's lifecycle, preferably before anyone starts coding. Source code control is the structural glue binding a project together. If programmers can't easily use each other's products then you'll never be able to make a good reproducible build and people will piss away a lot of time. It's also hell converting rogue build environments to a standard system. But it seems the rite of passage for every project to build their own custom environment that never quite works right.

Some issues to keep in mind:

* Shared source environments like CVS usually work best in largish projects.
* If you use CVS use a reference tree approach. With this approach a master build tree is kept of various builds. Programmers checkout source against the build they are working on. They only checkout what they need because the make system uses the build for anything not found locally. Using the -I and -L flags makes this system easy to setup. Search locally for any files and libraries then search in the reference build. This approach saves on disk space and build time.
* Get a lot of disk space. With disk space as cheap as it is there is no reason not to keep plenty of builds around.
* Make simple things simple. It should be dead simple and well documented on how to:
o check out modules to build
o how to change files
o how to add new modules into the system
o how to delete modules and files
o how to check in changes
o what are the available libraries and include files
o how to get the build environment including all compilers and other tools

Make a web page or document or whatever. New programmers shouldn't have to go around begging for build secrets from the old timers.
* On checkins log comments should be useful. These comments should be collected every night and sent to interested parties.

Sources
If you have the money, many projects have found ClearCase a good system. Perfectly workable systems have been built on top of GNU make and CVS. CVS is a free version control system originally built on top of RCS. Its main difference from RCS is that it supports a shared file model for building software.

Create a Bug Tracking System Early and Not Often
The earlier people get used to using a bug tracking system the better. If you are 3/4 through a project and then install a bug tracking system it won't be used. You need to install a bug tracking system early so people will use it.

Programmers generally resist bug tracking, yet when used correctly it can really help a project:

* Problems aren't dropped on the floor.
* Problems are automatically routed to responsible individuals.
* The lifecycle of a problem is tracked so people can argue back and forth with good information.
* Managers can make the big schedule and staffing decisions based on the number of and types of bugs in the system.
* Configuration management has a hope of matching patches back to the problems they fix.
* QA and technical support have a communication medium with developers.

Not sexy things, just good solid project improvements.
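
Automatic routing, from the list above, needs little more than a table of who owns which module. A minimal sketch, assuming a hypothetical ownership table and a hypothetical route_bug() helper; a real bug tracker would hold this table in its own configuration.

```php
<?php
// Hypothetical ownership table: module name => responsible person.
// A null owner marks code with common responsibility.
$moduleOwners = array(
    'billing' => 'alice',
    'search'  => 'bob',
    'common'  => null,
);

// Decide who a new bug report should be assigned to.
function route_bug($module, array $moduleOwners, $fallbackAssignee)
{
    // isset() is false for unknown modules and for null owners.
    if (isset($moduleOwners[$module])) {
        return $moduleOwners[$module];
    }
    // Common or unknown modules go to a fallback, e.g. the team lead.
    return $fallbackAssignee;
}

$billingAssignee = route_bug('billing', $moduleOwners, 'teamlead');
$commonAssignee = route_bug('common', $moduleOwners, 'teamlead');
$unknownAssignee = route_bug('reports', $moduleOwners, 'teamlead');
```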

Source code control should be linked to the bug tracking system. During the part of a project where source is frozen before a release, only checkins accompanied by a valid bug ID should be accepted. And when code is changed to fix a bug, the bug ID should be included in the checkin comments.
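
The freeze rule above can be enforced by a small check on the checkin comment. The sketch below is an assumption-laden illustration: the "BUG-1234" ID format, the function names, and the $knownBugIds array (standing in for a query to the bug tracker) are all hypothetical.

```php
<?php
// Assumed convention (not from the standard): bug IDs look like "BUG-1234".
define('BUG_ID_PATTERN', '/\bBUG-(\d+)\b/');

// Return every bug ID mentioned in a checkin comment.
function extract_bug_ids($comment)
{
    preg_match_all(BUG_ID_PATTERN, $comment, $matches);
    return $matches[0];
}

// During a freeze, accept a checkin only if it cites at least one bug
// that the tracker knows about. Outside a freeze, accept everything.
function checkin_allowed($comment, $sourceFrozen, array $knownBugIds)
{
    if (!$sourceFrozen) {
        return true;
    }
    foreach (extract_bug_ids($comment) as $bugId) {
        if (in_array($bugId, $knownBugIds, true)) {
            return true;
        }
    }
    return false;
}

$known = array('BUG-101', 'BUG-205');
$ok = checkin_allowed('Fix null session on login (BUG-205)', true, $known);
$noId = checkin_allowed('Tidy up whitespace', true, $known);
$unknownId = checkin_allowed('Fix crash, see BUG-999', true, $known);
$notFrozen = checkin_allowed('Tidy up whitespace', false, $known);
```

With Subversion, a check of this kind could be run from a pre-commit hook script so that a rejected checkin never reaches the repository.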

Sources
You can try AllTasks.net for bug tracking.

Honor Responsibilities
Responsibility for software modules is scoped. Modules are either the responsibility of a particular person or are common. Honor this division of responsibility. Don't go changing things that aren't your responsibility to change. Only mistakes and hard feelings will result.

Face it, if you don't own a piece of code you can't possibly be in a position to change it. There's too much context. Assumptions seemingly reasonable to you may be totally wrong. If you need a change simply ask the responsible person to change it. Or ask them if it is OK to make such-n-such a change. If they say OK then go ahead, otherwise holster your editor.

Every rule has exceptions. If it's 3 in the morning and you need to make a change to make a deliverable then you have to do it. If someone is on vacation and no one has been assigned their module then you have to do it. If you make changes in other people's code try and use the same style they have adopted.

Programmers need to mark with comments any code that is particularly sensitive to change. If code in one area requires changes to code in another area, then say so. If changing data formats will cause conflicts with persistent stores or remote message sending, then say so. If you are trying to minimize memory usage or achieve some other end, then say so. Not everyone is as brilliant as you.
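
As a sketch of what such a marker might look like, assume a hypothetical pair of functions that write and read a stored record. The names, the record layout, and the version constant are all invented for illustration.

```php
<?php
// SENSITIVE TO CHANGE: records written by encode_order() are stored on disk
// and read back by decode_order(). If you add, remove, or reorder a field,
// bump ORDER_FORMAT_VERSION and update decode_order() in the same checkin.
define('ORDER_FORMAT_VERSION', 1);

function encode_order($orderId, $amountCents)
{
    // Field order is part of the stored format: version|id|amount.
    return implode('|', array(ORDER_FORMAT_VERSION, $orderId, $amountCents));
}

function decode_order($record)
{
    list($version, $orderId, $amountCents) = explode('|', $record);
    if ((int) $version !== ORDER_FORMAT_VERSION) {
        throw new RuntimeException("Unsupported order format: $version");
    }
    return array('id' => (int) $orderId, 'amount_cents' => (int) $amountCents);
}

$record = encode_order(42, 1999);
$order = decode_order($record);
```

The comment names the other place that has to change and the reason, which is the context a visiting programmer would otherwise lack.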

The worst sin is to flit through the system changing bits of code to match your coding style. If someone isn't coding to the standards then ask them or ask your manager to ask them to code to the standards. Use common courtesy.

Code with common responsibility should be treated with care. Resist making radical changes, as the conflicts will be hard to resolve. Put comments in the file on how the file should be extended so everyone will follow the same rules. Try to use a common structure in all common files so people don't have to guess where to find things and how to make changes. Check in changes as soon as possible so conflicts don't build up.

As an aside, module responsibilities must also be assigned for bug tracking purposes.

PHP Code Tags
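PHP tags are used to delimit PHP from HTML in a file. There are several ways to do this: <?php ?>, <? ?>, <script language="php"> </script>, <% %>, and <?=$name?>. Some of these may be turned off in your PHP settings, and the <script language="php"> and <% %> forms were removed entirely in PHP 7.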

* Use <?php ?>

Justification

* <?php ?> is always available in any system and setup.

Example

<?php print "Hello world"; ?> // Will print "Hello world"

<? print "Hello world"; ?> // Will print "Hello world"

<script language="php"> print "Hello world"; </script> // Will print "Hello world"

<% print "Hello world"; %> // Will print "Hello world"

<?=$street?> // Will print the value of the variable $street

No Magic Numbers
A magic number is a bare-naked number used in source code. It's magic because no one has a clue what it means, including the author within three months. For example:

if (22 == $foo) { start_thermo_nuclear_war(); }
else if (19 == $foo) { refund_lotso_money(); }
else if (16 == $foo) { infinite_loop(); }
else { cry_cause_im_lost(); }

In the above example, what do 22, 19, and 16 mean? If a number changed, or the numbers were just plain wrong, how would you know?

Heavy use of magic numbers marks a programmer as an amateur more than anything else. Such a programmer has never worked in a team environment or had to maintain code, or they would never do such a thing.

Instead of magic numbers use a real name that means something. You should use define(). For example:

define("PRESIDENT_WENT_CRAZY", 22);
define("WE_GOOFED", 19);
define("THEY_DIDNT_PAY", 16);

if (PRESIDENT_WENT_CRAZY == $foo) { start_thermo_nuclear_war(); }
else if (WE_GOOFED == $foo) { refund_lotso_money(); }
else if (THEY_DIDNT_PAY == $foo) { infinite_loop(); }
else { happy_days_i_know_why_im_here(); }

Now isn't that better?
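
define() is not the only way to name a value. As a sketch of an alternative, assuming PHP 5 or later, the same codes can be grouped as class constants. The class name FooCode and the describe_foo() helper are hypothetical and not part of the standard.

```php
<?php
// Hypothetical grouping of the codes from the example above.
class FooCode
{
    const PRESIDENT_WENT_CRAZY = 22;
    const WE_GOOFED = 19;
    const THEY_DIDNT_PAY = 16;
}

// Return a label instead of calling the (fictional) handlers,
// so the sketch can be run on its own.
function describe_foo($foo)
{
    switch ($foo) {
        case FooCode::PRESIDENT_WENT_CRAZY:
            return 'start_thermo_nuclear_war';
        case FooCode::WE_GOOFED:
            return 'refund_lotso_money';
        case FooCode::THEY_DIDNT_PAY:
            return 'infinite_loop';
        default:
            return 'happy_days_i_know_why_im_here';
    }
}

$crazy = describe_foo(22);
$unpaid = describe_foo(16);
$other = describe_foo(7);
```

Grouping the constants in a class keeps related codes together and out of the global namespace, at the cost of a slightly longer name at each use.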

Thin vs. Fat Class Interfaces
How many methods should an object have? The right answer, of course, is just the right amount. We'll call this the Goldilocks level. But what is the Goldilocks level? As a fixed number, it doesn't exist. You need to make the right judgment for your situation, which is really what programmers are for :-)

The two extremes are thin classes versus thick classes. Thin classes are minimalist classes. Thin classes have as few methods as possible. The expectation is users will derive their own class from the thin class adding any needed methods.

While thin classes may seem "clean", they really aren't. You can't do much with a thin class. Its main purpose is setting up a type. Since thin classes have so little functionality, many programmers in a project will create derived classes, with everyone adding basically the same methods. This leads to code duplication and maintenance problems, and avoiding those is part of the reason we use objects in the first place. The obvious solution is to push methods up to the base class. Push enough methods up to the base class and you get thick classes.

Thick classes have a lot of methods. If you can think of it, a thick class will have it. Why is this a problem? It may not be. If the methods are directly related to the class, then there's no real problem with the class containing them. The problem is that people get lazy and start adding methods that are related to the class only in some will-o'-the-wisp way and would be better factored out into another class. Judgment comes into play again.

Thick classes have other problems. As classes get larger they may become harder to understand. They also become harder to debug as interactions become less predictable. And when a method you don't use or care about is changed, your code will still have to be retested and rereleased.
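
The trade-off can be made concrete with a small sketch. All class names here are hypothetical: a thin base class, two derived classes that each add the same method, and a thicker base class with that method pushed up.

```php
<?php
// Thin base class: little more than a type.
class ThinList
{
    protected $items = array();

    public function add($item)
    {
        $this->items[] = $item;
    }
}

// Two programmers derive from it and each writes the same method.
class ReportList extends ThinList
{
    public function count()
    {
        return count($this->items);
    }
}

class InvoiceList extends ThinList
{
    public function count()
    {
        return count($this->items); // duplicate of ReportList::count()
    }
}

// Pushing the shared method up removes the duplication. count() is
// directly related to a list, so it belongs here; unrelated helpers
// (say, e-mailing the list) would be better in another class.
class ThickerList
{
    protected $items = array();

    public function add($item)
    {
        $this->items[] = $item;
    }

    public function count()
    {
        return count($this->items);
    }
}

$reports = new ReportList();
$reports->add('q1');
$reports->add('q2');

$thicker = new ThickerList();
$thicker->add('q1');
```

Where to stop pushing methods up is the judgment call described above: each method added to the base class is one more thing every user of the class has to retest when it changes.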


4 Ace Technologies