3 Outline MateriCompare and contrast conventional files and modern, relational databases.Define and give examples of fields, records, files, and databases.Describe a modern data architecture that includes files, operational databases, data warehouses, personal databases, and work group databases.Compare the roles of systems analyst, database administrator, and data administrator as they relate to databases.Describe the architecture of a database management systemDescribe how a relational database implements entities, attributes, and relationships from a logical data model.Transform a logical data model into a physical, relational database schema.Generate SQL code to create the database structure in a schema.

5 Conventional Files versus the DatabaseFile – a collection of similar records.Files are unrelated to each other except in the code of an application program.Data storage is built around the applications that use the files.Database – a collection of interrelated filesRecords in one file (or table) are physically related to records in another file (or table).Applications are built around the integrated databaseTeaching NotesThe point about data built around applications or vice-versa is illustrated on the next slide.

7 Pros and Cons of Conventional FilesEasy to design because of their single-application focusExcellent performance due to optimized organization for a single applicationConsHarder to adapt to sharing across applicationsHarder to adapt to new requirementsNeed to duplicate attributes in several files.No additional notes

8 Pros and Cons of DatabasesData independence from applications increases adaptability and flexibilitySuperior scalabilityAbility to share data across applicationsLess, and controlled redundancy (total non-redundancy is not achievable)ConsMore complex than file technologySomewhat slower performanceInvestment in DBMS and database expertsNeed to adhere to design principles to realize benefitsIncreased vulnerability due to consolidating data in a centralized databaseNo additional notes

9 FieldsField – the smallest unit of meaningful data to be stored in a databasethe physical implementation of a data attributePrimary key – a field that uniquely identifies a record.Secondary key – a field that identifies a single record or a subset of related records.Foreign key – a field that points to records in a different file.Descriptive field – any nonkey field.No additional notes.

10 Record – a collection of fields arranged in a predetermined format.RecordsRecord – a collection of fields arranged in a predetermined format.Fixed-length record structuresVariable-length record structuresBlocking factor – the number of logical records included in a single read or write operation (from the computer’s perspective).No additional notes

11 File – the set of all occurrences of a given record structure. Files and TablesFile – the set of all occurrences of a given record structure.Table – the relational database equivalent of a file.Types of conventional files and tablesMaster files – Records relatively permanent though values may changeTransaction files – Records describe business eventsDocument files – Historical data for review without overhead of regenerating documentArchival files – Master and transaction records that have been deletedTable lookup files – Relatively static data that can be shared to maintain consistencyAudit files – Special records of updates to other filesNo additional notes

12 Database technology usually predetermines and/or limits thisFiles and TablesPrevious file design methods required that the analyst specify precisely how the records in a database should be:Sequenced (File organization)Accessed (File access)Database technology usually predetermines and/or limits thisTrained database administrator may be given some control over organization, storage location, and access methods for performance tuning.No additional notes.

13 Data architecture – a definition of how:Files and databases are to be developed and used to store dataThe file and/or database technology to be usedThe administrative structure set up to manage the data resourceData is stored in some combination of:Conventional filesOperational databases – databases that support day-to-day operations and transactions for an information system. Also called transactional databases.Data warehouses – databases that store data extracted from operational databases.To support data miningPersonal databasesWork group databasesNo additional notes

14 A Modern Data ArchitectureTeaching NotesIt is useful to walk through this diagram.

16 Database ArchitectureDatabase architecture – the database technology used to support data architectureIncluding the database engine, database utilities, CASE tools, and database development tools.Database management system (DBMS) – special software used to create, access, control, and manage a database.The core of the DBMS is its database engine.A data definition language (DDL) is that part of the engine used to physically define tables, fields, and structural relationships.A data manipulation language (DML) is that part of the engine used to create, read, update, and delete records in the database, and navigate between different records in the database.No additional notes

17 Typical DBMS ArchitectureTeaching NotesPoint out everything above the DBMS block is traditionally called front-end while the DMBS and data are called back-end.Explain that metadata is “data about data” that describes the structure of the user data.

18 Relational DatabasesRelational database – a database that implements stored data in a series of two-dimensional tables that are “related” to one another via foreign keys.The physical data model is called a schema.The DDL and DML for a relational database is called SQL (Structured Query Language).Triggers – programs embedded within a database that are automatically invoked by updates.Stored procedures – programs embedded within a database that can be called from an application program.No additional notes

22 What is a Good Data Model?A good data model is simpleThe data attributes that describe an entity should describe only that entityA good data model is essentially nonredundantEach data attribute exists in at most one entity (except for foreign keys)A good data model should be flexible and adaptable to future needsThese goals are achieved through database normalization.No additional notes

23 Database Normalization (also see Chapter 8)An logical entity (or physical table) is in first normal form if there are no attributes (fields) that can have more than one value for a single instance (record).An logical entity (or physical table) is in second normal form if it is already in first normal form and if the values of all nonprimary key attributes are dependent on the full primary key.An logical entity (or physical table) is in third normal form if it is already in second normal form and if the values of all nonprimary key attributes are not dependent on other nonprimary key attributes .Teaching NotesThis is a review from Chapter 8.Logical entity = Physical table.

25 Goals of Database DesignA database should provide for efficient storage, update, and retrieval of data.A database should be reliable—the stored data should have high integrity and promote user trust in that data.A database should be adaptable and scalable to new and unforeseen requirements and applications.No additional notes

27 Database SchemaDatabase schema – a model or blueprint representing the technical implementation of the database.Also called a physical data modelNo additional notes

28 A Method for Database DesignReview the logical data model.Create a table for each entity.Create fields for each attribute.Create an index for each primary and secondary key.Create an index for each subsetting criterion.Designate foreign keys for relationships.Define data types, sizes, null settings, domains, and defaults for each attribute.Create or combine tables to implement supertype/ subtype structures.Evaluate and specify referential integrity constraints.Teaching NotesReferential integrity is discussed on the next slide.

32 Key integrity – Every table should have a primary key. Database IntegrityKey integrity – Every table should have a primary key.Domain integrity – Appropriate controls must be designed to ensure that no field takes on an inappropriate valueReferential integrity – the assurance tat a foreign key value in one table has a matching primary key value in the related table.No restrictionDelete: cascadeDelete: restrictDelete: set nullNo additional notes

34 Database Distribution and ReplicationData distribution analysis establishes which business locations need access to which logical data entities and attributes.CentralizationEntire database on a single server in one physical locationHorizontal distribution (also called partitioning)Tables or row assigned to different database servers/locations.Efficient access and securityCannot always be easily recombined for management analysisVertical distribution (also called partitioning)Specific columns of tables assigned to specific databases and serversSimilar advantages and disadvantages of HorizontalReplicationData duplicated in multiple locationsDBMS coordinates updates and synchronization of dataPerformance and accessibility advantagesIncreases complexityNo additional notes

35 Database Capacity PlanningFor each table sum the field sizes. This is the record size.For each table, multiply the record size times the number of entity instances to be included in the table (planning for growth). This is the table size.Sum the table sizes. This is the database size.Optionally, add a slack capacity buffer (e.g. 10percent) to account for unanticipated factors. This is the anticipated database capacity.No additional notes.