One example of EAV modeling in production databases is seen with the clinical findings (past history, present complaints, physical examination, lab tests, special investigations, diagnoses) that can apply to a patient. Across all specialties of medicine, these can range in the hundreds of thousands (with new tests being developed every month). The majority of individuals who visit a doctor, however, have relatively few findings.
No doctor would ever have the time to ask a patient about every possible finding. Instead, the doctor focuses on the primary complaints, and asks questions related to these. Still other questions are based on the responses to previously asked questions. Other questions or tests may be related to findings in the physical exam. The presence of certain findings automatically rules out many others — e.g., one would not consider pregnancy, and medical conditions associated with it if the patient were a male.
When the patient's record is summarized, one typically records "positive" findings — e.g., the presence of an enlarged and hardened liver — as well as "significant negatives" — e.g., the absence of signs suggestive of alcoholism (which is one of the causes to a hard, enlarged liver). In any case, one would not record the vast number of non-relevant findings that were not looked for or found in this particular patient.
Consider how one would try to represent a general-purpose clinical record in a relational database. Clearly creating a table (or a set of tables) with thousands of columns is not the way to go, because the vast majority of columns would be null. To complicate things, in a longitudinal medical record that follows the patient over time, there may be multiple values of the same parameter: the height and weight of a child, for example, change as the child grows. Finally, the universe of clinical findings keeps growing: for example, diseases such as SARS emerge, and new lab tests are devised; this would require constant addition of columns, and constant revision of the user interface. (The situation where the list of attributes changes frequently is termed "attribute volatility" in database parlance.)
The following shows a snapshot of an EAV table for clinical findings. The entries shown within angle brackets are references to entries in other tables, shown here as text rather than as encoded foreign key values for ease of understanding. They represent some details of a visit to a doctor for fever on the morning of 1/5/98. In this example, the values are all literal values, but these could also be foreign keys to pre-defined value lists; these are particularly useful when the possible values are known to be limited.
The entity. For clinical findings, the entity is the patient event: a foreign key into a table that contains at a minimum a patient ID and one or more time-stamps (e.g., the start and end of the examination date/time) that record when the event being described happened.
The attribute or parameter: a foreign key into a table of attribute definitions (in this example, definitions of clinical findings). At the very least, the attribute definitions table would contain the following columns: an attribute ID, attribute name, description, data type, units of measurement, and columns assisting input validation, e.g., maximum string length and regular expression, maximum and minimum permissible values, set of permissible values, etc.
The value of the attribute. This would depend on the data type, and we discuss how values are stored shortly.
The example below illustrates symptoms findings that might be seen in a patient with pneumonia.