Transcript
Model behavior: An Introduction to data models
Data models:
- Organizes data elements.
- Standardizes relation of data elements to each other.
- Standardizes relation of data elements to properties of real-world entities.
Database models:
- A type of data model.
- Determines a database’s logical structure.
- Determines how data can be stored, organized, and manipulated.
A normalized relational database and normalized star schema are two of the most common database models, each of which as its advantages.
Normalized relational database
- Uses tables to make databases independent of software applications.
Database normalization:
- Process of structuring relational database.
- Reduces data redundancy and improves data integrity.
- Organizes columns or attributes in tables.
- Enforces various data integrity constraints.
- Must meet requirements for previous normalization level before next level can be achieved.
Unnormalized data:
- May or may not have primary key.
- Table does not have duplicate records.
First normal form (1NF):
- Requires primary key.
- Requires all cells in tables to have single values.
- Move attributes that could have multiple values to separate table.
- Enforces various data integrity constraints.
- Create primary key to connect new table to old table.
Second normal form (2NF)
- 1NF plus removing table’s partial dependencies.
- All attributes must depend on entire primary key.
Third normal form (3NF):
- 2NF plus removing all transitive dependencies.
- Attributes may only depend on primary key.
Star schema:
- Post-relational data model.
- More generalized data model than relational data model.
- Most common data model in data warehouses.
- Unnormalized data.
- Consists of at least one fact table that references any number of dimension tables.
Fact table:
- Records measurements for specific events.
- Usually consists of numeric values and foreign key.
- Foreign key links to dimensional table containing descriptive information for numeric values.
- Records measurements at low level of detail or granularity.
- Results in accumulation of many records over time.
- Three types: Transaction, snapshot, and accumulating snapshot fact tables.
Dimension table:
- Has smaller number of records compared to fact table
- Number of attributes in their records can be very large.
- Defines many types of dimensions:
Time: (most common) Describe time at which events are recorded in fact table.
Range: Describe range of measurable quantities to simplify reporting.
Other: Tables for employees, geography, and products.
Comparison:
Advantages of normalized relational database versus star schema:
- Strictly reinforces data integrity to prevent anomalies like one-off inserts and updates.
- Greater flexibility in performing analytical tasks if they follow database model’s business logic.
- More easily supports many-to-many relationships between business entities.
Benefits of star schema versus normalized relational database:
Simpler reporting logic, especially as-of and period-over-period reporting.
Simpler join-logic:
- Improves performance on read-only operations like reporting and queries.
- Improves performance of aggregration operations.
- Efficiently build proprietary cubs for online analytical processing (OLAP) systems.
For more information, please refer to Whitepaper : Model Behavior: An Introduction to Data Models .
Topics : Data Governance,Data Modeling,
Products : ER/Studio Data Architect,ER/Studio Enterprise Team Edition,
Infographic : ER/Studio Data Architect
Model Behavior: An Introduction to Data Models
A data model organizes data elements and standardizes their relation to each other and the properties of real-world entities. A database model is a type of data model that determines a database’s logical structure and how data can be stored, organized, and manipulated in that database. A normalized relational database and an unnormalized star schema are two of the most common database models, each of which has its advantages.
ER/Studio Data Architect enables you to efficiently catalog your current data assets and sources across different platforms and track end-to-end data lineage. Simplify your data architecture with a common language leveraging consistent naming standards and data definitions. Easily specify the sensitive data objects that need heightened protection, to withstand audit scrutiny.