File Organisation - PowerPoint PPT Presentation

File Organisation. Placing a file on disk. A file is a sequence of records. Each record has a record type and record fields; each field has a data type and a number of bytes, which may be fixed or variable. Record characteristics. A logical view: SELECT * FROM STUDENTS or (Smith, 17, 1, CS), (Brown, 8, 2, CS) or


PowerPoint Slideshow about ' File Organisation' - pello


Programs search for the desired record in the buffers, using the information in the file header.

If the address of the block with desired record is not known, the search programs must do a linear search through the file blocks. Each file block is copied into a buffer and searched either until the record is located or all the file blocks have been searched unsuccessfully.
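The linear search described above can be sketched as follows. This is a minimal illustration, not the slides' own code: the function name `linear_search`, the in-memory list standing in for disk blocks, and the sample records for Jones and Adams are all assumptions made for the example.

```python
def linear_search(blocks, key):
    """Scan file blocks in order; return (record, block_transfers).

    `blocks` is a hypothetical in-memory stand-in for the file's disk blocks.
    Each record is a tuple whose first element is the search key.
    """
    transfers = 0
    for block in blocks:
        buffer = list(block)   # copy the block into a buffer: one block transfer
        transfers += 1
        for record in buffer:
            if record[0] == key:
                return record, transfers   # record located
    return None, transfers                 # all blocks searched unsuccessfully


# Two blocks of two records each (illustrative data only).
file_blocks = [
    [("Smith", 17), ("Brown", 8)],
    [("Jones", 12), ("Adams", 5)],
]

print(linear_search(file_blocks, "Jones"))   # found in the second block
print(linear_search(file_blocks, "Nobody"))  # every block is read
```

The number of transfers grows with the position of the record in the file, which is the cost a good file organisation tries to avoid.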

The goal of a good file organization is to locate the block that contains a desired record with a minimal number of block transfers

DBMS memory is handled in units of a page, e.g. 4K, 8K. Pages in memory represent one or more hardware blocks from the disk

If a single item is needed, the whole block is transferred

The time taken for an I/O operation depends on the location of the data on the disk and is lower when seek times and rotational delays are small. Recall that: access time = seek time + rotational delay + transfer time
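As a rough illustration of the formula, the sketch below compares reading scattered blocks with reading contiguous blocks. The timing figures (8 ms seek, 4 ms rotational delay, 1 ms transfer) are hypothetical, and the contiguous case assumes a simplified model in which a single seek and a single rotational delay are paid for the whole run.

```python
def access_time_ms(seek_ms, rotational_delay_ms, transfer_ms):
    """access time = seek time + rotational delay + transfer time (one block)."""
    return seek_ms + rotational_delay_ms + transfer_ms


def total_time_ms(n_blocks, seek_ms, rotational_delay_ms, transfer_ms, contiguous):
    """Time to read n_blocks under a simplified, assumed disk model."""
    if contiguous:
        # Assumption: one seek and one rotational delay, then sequential transfer.
        return seek_ms + rotational_delay_ms + n_blocks * transfer_ms
    # Assumption: every scattered block pays the full access time.
    return n_blocks * access_time_ms(seek_ms, rotational_delay_ms, transfer_ms)


# Hypothetical figures, in milliseconds.
scattered = total_time_ms(100, 8, 4, 1, contiguous=False)
sequential = total_time_ms(100, 8, 4, 1, contiguous=True)
print(scattered, sequential)
```

Under these assumed numbers the transfer time is the same in both cases; the difference comes entirely from the seeks and rotational delays that contiguous storage avoids.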

The reasons many DBMSs do not rely on the OS file system are:

Higher-level DB operations, e.g. JOIN, have a known pattern of page accesses and can be translated into known sets of I/O operations.

The buffer manager can PRE-FETCH pages by anticipating the next request. This is especially efficient when the required data are stored CONTIGUOUSLY on disk.

Open addressing: If location specified by hash address is occupied then the subsequent positions are checked in order until an unused (empty) position is found.

Chaining: various overflow locations are kept, and a pointer field is added to each record location. A collision is resolved by placing the new record in an unused overflow location and setting the pointer of the occupied hash address location to the address of that overflow location.

Multiple hashing: A second hash function is applied if the first results in a collision.
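The first two collision resolution methods above can be sketched as follows. This is a minimal illustration under stated assumptions: the hash function is taken to be `key mod table size`, open addressing wraps around at the end of the table, and the chaining version links each new record at the end of the chain. The function names and the `[key, pointer]` slot layout are hypothetical.

```python
def insert_open_addressing(table, key):
    """Probe subsequent positions in order until an unused one is found."""
    m = len(table)
    start = key % m                     # assumed hash function: key mod m
    for step in range(m):
        pos = (start + step) % m        # assumption: wrap around at the end
        if table[pos] is None:
            table[pos] = key
            return pos
    raise RuntimeError("table is full")


def insert_chaining(primary, overflow, key):
    """Place a colliding record in an overflow location and link it by pointer.

    Each slot is [key, pointer], where pointer is an index into `overflow`
    or None. Appending at the end of the chain is one possible choice.
    """
    home = key % len(primary)
    if primary[home] is None:
        primary[home] = [key, None]
        return ("primary", home)
    slot = primary[home]
    while slot[1] is not None:          # follow pointers to the end of the chain
        slot = overflow[slot[1]]
    overflow.append([key, None])        # take an unused overflow location
    slot[1] = len(overflow) - 1         # set the pointer to that location
    return ("overflow", len(overflow) - 1)


table = [None] * 10
print([insert_open_addressing(table, k) for k in (72, 62, 32)])

primary, overflow = [None] * 10, []
print([insert_chaining(primary, overflow, k) for k in (72, 62, 32)])
```

Multiple hashing is not shown; it would replace the probing loop with a call to a second hash function.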

The hashing scheme is called static hashing if a fixed number of buckets M is allocated.

If a record is to be retrieved with a search condition on the key, the hashing function is applied to the key to determine the number of the bucket that may contain the record, and that bucket is then examined. If the record is not in that bucket, the search continues in the overflow buckets.
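A static hash file with a fixed number of buckets M might be sketched like this. The class name `StaticHashFile`, the use of `key mod M` as the hashing function, and the one-overflow-list-per-bucket layout are assumptions made for illustration only.

```python
class StaticHashFile:
    """Static hashing: M buckets are allocated once and never change."""

    def __init__(self, m, bucket_capacity):
        self.m = m
        self.capacity = bucket_capacity
        self.buckets = [[] for _ in range(m)]
        # Assumption: each bucket has its own overflow list.
        self.overflow = [[] for _ in range(m)]

    def insert(self, key, record):
        b = key % self.m                          # bucket number from the key
        if len(self.buckets[b]) < self.capacity:
            self.buckets[b].append((key, record))
        else:
            self.overflow[b].append((key, record))  # bucket full: use overflow

    def search(self, key):
        b = key % self.m
        for k, record in self.buckets[b]:         # examine the bucket first
            if k == key:
                return record
        for k, record in self.overflow[b]:        # then the overflow bucket
            if k == key:
                return record
        return None


f = StaticHashFile(m=10, bucket_capacity=2)
for key, name in [(72, "a"), (62, "b"), (32, "c")]:
    f.insert(key, name)
print(f.search(32), f.search(42))
```

With M fixed at 10 and a capacity of 2, the third colliding key lands in overflow, which is the situation dynamic schemes try to reduce.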

Instead, have a “family” of algorithms to manage dynamic expansion and contraction of the file

Start with a set number of M buckets 0..M-1 with hashing function mod M

Split the buckets in linear order when more space is needed. The next hashing function is mod 2M, and subsequent ones are mod 4M, mod 8M, etc. as required

Example: Block capacity is 2 records. Records with values 72, 62, 32 are colliding for hashing function (mod 10), but after application of next hashing function (mod 20) they do not (one bucket contains 72 and 32 and another 62).
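The arithmetic in this example can be checked with a short sketch. The helper name `bucket_contents` is hypothetical; it simply groups keys by `key mod modulus` and does not model the order in which buckets are split.

```python
def bucket_contents(keys, modulus):
    """Group keys by bucket number under the hashing function key mod modulus."""
    buckets = {}
    for key in keys:
        buckets.setdefault(key % modulus, []).append(key)
    return buckets


keys = [72, 62, 32]
print(bucket_contents(keys, 10))  # {2: [72, 62, 32]}
print(bucket_contents(keys, 20))  # {12: [72, 32], 2: [62]}
```

Under mod 10 all three keys map to bucket 2, which exceeds the block capacity of 2. Under mod 20 the keys 72 and 32 share bucket 12 and 62 stays in bucket 2, so neither bucket overflows.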