When you design a database, you first want to normalize it. The main purpose is to avoid data duplication, because duplicate data takes up unnecessary space and is harder to maintain.

Suppose you want to store information about your customers. You want to store their address so you can send them promotional material. You also want to store which products they have bought so far. If you put all of that in one table, you'd be repeating the customer's address for each product they bought. When a customer changes address, you need to remember to update every one of those records, or you end up with inconsistent data.
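To make the inconsistency concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name customer_purchase, its columns and the sample rows are all made up for illustration; they are not from any real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single wide table: the address is stored again on every purchase row.
conn.execute("""
    CREATE TABLE customer_purchase (
        customer_name TEXT,
        street        TEXT,
        city          TEXT,
        product       TEXT
    )
""")
conn.executemany(
    "INSERT INTO customer_purchase VALUES (?, ?, ?, ?)",
    [
        ("Alice", "1 Old Street", "Springfield", "kettle"),
        ("Alice", "1 Old Street", "Springfield", "toaster"),
        ("Alice", "1 Old Street", "Springfield", "blender"),
    ],
)

# An incomplete update: only one of Alice's three rows gets the new address.
conn.execute(
    "UPDATE customer_purchase SET street = '2 New Road' "
    "WHERE customer_name = 'Alice' AND product = 'kettle'"
)

# The same customer now has two different addresses on record.
addresses = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT street FROM customer_purchase "
        "WHERE customer_name = 'Alice' ORDER BY street"
    )
]
print(addresses)
```

Nothing in the table itself stops this from happening; it is up to every piece of code that writes to it to update all the rows.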

So you normalize this bit, and create a table with e.g. customer number + customer name + customer street + customer zip code/postal code, a second table with zip code + city, a third table with customer number + product number, a fourth table with product number + product description + vendor number, etc.
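The four tables described above could look roughly like this. It is only a sketch with sqlite3: the table and column names are my own guesses at a reasonable naming, and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zip_city (
        zip  TEXT PRIMARY KEY,
        city TEXT NOT NULL
    );
    CREATE TABLE customer (
        customer_no INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        street      TEXT NOT NULL,
        zip         TEXT NOT NULL REFERENCES zip_city (zip)
    );
    CREATE TABLE product (
        product_no  INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        vendor_no   INTEGER NOT NULL
    );
    CREATE TABLE purchase (
        customer_no INTEGER NOT NULL REFERENCES customer (customer_no),
        product_no  INTEGER NOT NULL REFERENCES product (product_no)
    );
    -- Invented sample data.
    INSERT INTO zip_city VALUES ('12345', 'Springfield');
    INSERT INTO customer VALUES (1, 'Alice', '1 Old Street', '12345');
    INSERT INTO product  VALUES (10, 'kettle', 500), (11, 'toaster', 501);
    INSERT INTO purchase VALUES (1, 10), (1, 11);
""")

# The address lives in exactly one row, so a move is a single-row update.
cur = conn.execute(
    "UPDATE customer SET street = ? WHERE customer_no = ?", ("2 New Road", 1)
)
updated_rows = cur.rowcount

# Reading everything back now takes three joins.
rows = conn.execute("""
    SELECT c.name, c.street, z.city, p.description
    FROM purchase AS pu
    JOIN customer AS c ON c.customer_no = pu.customer_no
    JOIN zip_city AS z ON z.zip = c.zip
    JOIN product  AS p ON p.product_no = pu.product_no
    ORDER BY p.product_no
""").fetchall()
print(updated_rows, rows)
```

The update touches one row no matter how many products the customer bought, which is the maintenance win. The price is the join-heavy query at the end.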

Now look at the I/O involved in getting at that data. When you put all the data in one table, accessing it will normally involve fewer I/O operations, and therefore be faster, than accessing the same data spread over multiple tables, which requires jumping back and forth between indexes and data records. And although I/O performance has improved tremendously since the early days, I/O is still the slowest component in a computer.

Computers with slow I/O subsystems may also benefit from denormalization. Denormalization is basically the process of finding the balance between avoiding data duplication and ensuring database performance.
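One common way to strike that balance is to keep the normalized tables as the source of truth and add a flattened copy for reads. The sketch below assumes that approach; purchase_flat and the rest of the names are hypothetical, and the schema is cut down to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT, street TEXT);
    CREATE TABLE product  (product_no  INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE purchase (customer_no INTEGER, product_no INTEGER);
    INSERT INTO customer VALUES (1, 'Alice', '1 Old Street');
    INSERT INTO product  VALUES (10, 'kettle'), (11, 'toaster');
    INSERT INTO purchase VALUES (1, 10), (1, 11);

    -- Denormalized copy for reads: the joins are paid once, when the copy is built.
    CREATE TABLE purchase_flat AS
        SELECT c.customer_no, c.name, c.street, p.description
        FROM purchase AS pu
        JOIN customer AS c ON c.customer_no = pu.customer_no
        JOIN product  AS p ON p.product_no  = pu.product_no;
""")

# Normalized read: two joins on every query.
joined_rows = conn.execute("""
    SELECT c.name, c.street, p.description
    FROM purchase AS pu
    JOIN customer AS c ON c.customer_no = pu.customer_no
    JOIN product  AS p ON p.product_no  = pu.product_no
    ORDER BY p.description
""").fetchall()

# Denormalized read: same result from a single table, no joins.
flat_rows = conn.execute(
    "SELECT name, street, description FROM purchase_flat ORDER BY description"
).fetchall()

# The cost: updating only the source table leaves the copy stale.
conn.execute("UPDATE customer SET street = '2 New Road' WHERE customer_no = 1")
stale = conn.execute("SELECT DISTINCT street FROM purchase_flat").fetchall()
print(flat_rows == joined_rows, stale)
```

The flat table gives the cheap read, but the old address is still sitting in it after the update, so whoever writes has to refresh the copy as well. That extra bookkeeping is the duplication cost you are trading for speed.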

