
What is a data dictionary and how is it used when communicating and managing requirements?
Definition
A data dictionary is a collection of the definitions of the structure of information that is relevant to a set of requirements. That’s a lot of words for a simple concept. We need to know (and constrain) a set of information about some business element when managing our requirements. We use a data dictionary to define what that information is, and any constraints on how it must be used.
Viewing The System
When using object oriented analysis (OOA) as part of defining requirements, we represent business concepts as objects and processes. For example, an order management system might define orders as having line items and customers. We can represent that information graphically with a UML diagram like the following:
In prose, we could also capture the same information as follows:
- The system shall include a representation of customer orders.
- Each order will have a single associated customer, and each customer can have multiple orders. Note that a customer is not required to have any orders.
- Each order will have at least one, and possibly multiple line items. Each line item is uniquely associated with a single order.
- Each line item represents a single product. Note that a product is not required to be represented in a line item. A product can be represented in multiple line items (even within the same quote).
While this diagram tells us about the structure at a high level, it doesn’t tell us enough information to go implement the solution. What exactly is a line item? What information does it contain? And what format must that information be in?
A Dictionary Entry
We could create a data dictionary entry for the line item object as follows:
Line Item
A line item represents a portion of a customer order that describes a product being ordered, as well as the quantity of that product being ordered. Each line item must include the following information:
- A reference to the product being ordered, using the product ID per constraint X1. [Note, the constraint is imposed by the existing product data management system, with which our software is required to integrate.]
- The quantity of the product being ordered, where the quantity is a positive integer. [Note, we would include a maximum value, if there were a constraint imposed by some other part of the system.]
Note that we have not specified that a line item includes a price. It is very likely that a line item would have a price, but we would be specifying implementation details if we did. Pricing may be done per product, or may be unique for each product for any given customer. Discounts may be applied based upon quantity of products in a line item, or dollar amount for an order. Discounts may be applied based upon all products ordered by a customer over a period of time. These different possibilities are a function of the requirements of the system.
When those business requirements are defined, they will dictate the ownership of properties by business objects. With that information, we can include the data as appropriate. For example, a list price property may be defined for the product object, or a customer-price may be defined for a line-item as a function of (product, quantity, customer). We would add that data as part of the business modeling. Note that this is a description of the problem domain, not a description of the implementation.
Another Data Dictionary Example
Here’s an example of a “Customer”
A customer represents the business or person for whom an order has been placed. Note that all character fields are to be represented in unicode 4.1.0 or later per corporate policy ABC. A customer has the following information:
- Name. 50 characters representing the name of the customer.
- Shipping Address 1. 100 characters representing the first line of the address to which all customer shipments are made.
- Shipping Address 2. 100 characters representing the second line of the address to which all customer shipments are made.
- Shipping Country. 50 characters…
- Billing Address.
- Customer Contact.
- etcetera.
This list is intended to show all of the elements of information that must be present in the “customer object” to support the requirements of the system.
Further Reading
Joe, at Seilevel wrote a post back in March with a good explanation of data dictionary entries. As Joe points out, requirements can drive the need for specific information.
For example, my business users have told me that the number of decimal places of each weight value tracked by the system is very important for monitoring and reporting. It stands to reason that other objects and attributes might require the same level of specification. If you figure it out once, you can use it in many places.
Barbara, at B2T Training points out the importance of understanding the details of the data for a system. She also touches on the value of having that information in a separate document.
Many BAs document data as part of the business process or part of the Use Case. Our recommendation is that you document data in a separate part of the requirements package because it is often used in multiple places.
Usage
A data dictionary should be defined as a repository of all data definitions like the examples above. Those examples should be referenced in all requirements documents that rely on the defined objects. Requirements documents should not specify the content of the objects, they should defer to the referenced dictionary entries.
Some projects, especially migration projects, have many constraints tied to data formats and structure. These projects will have extensive data dictionaries, and multiple references to entries throughout the requirements document. Other projects will have far fewer constraints on data formats, but will still have explicit structural definitions for business objects (like our line item example).
- – -
Check out the index of the Foundation Series posts which will be updated whenever new posts are added.


