4
The O.S. and the I/O hardware
Management of I/O devices is one of the main tasks of the operating system.
–Handle message passing to and from the system
–Handle interrupts and errors
–Offer a simple interface
–Offer device independence (to some degree)
Viewpoints:
–Users
–Programmers
–Technicians

8
I/O Devices –Unit of transfer Blocks of characters versus streams of characters –Representation of data Coding conventions, parity,… –Error states Kind of errors, reporting modes,… Need a uniform approach to I/O as seen from the user and from the operating system point of view

9
Organisation
The I/O function can be approached in three ways:
–Programmed I/O: continuous attention of the processor is required
–Interrupt-driven I/O: the processor launches the I/O and can continue until interrupted
–Direct memory access (DMA): the DMA module governs the exchange of data between the I/O unit and the main memory

16
Evolution in I/O organisation
1. The processor controls the device directly
2. A controller or I/O module is used, separating the processor from the details of the I/O device
3. As in 2, but using interrupts: the burden of supervising the device continuously disappears
4. The I/O module gains access to main memory through DMA, moving data in and out of memory without processor attention
5. The I/O module becomes a processor with an I/O-specific instruction set, to be programmed by the processor in main memory
6. The I/O module becomes a computer in its own right (a “channel”)

17
Direct Memory Access I/O module shares the bus with the processor It can use the bus only when the CPU does not need it Or it can steal cycles from the CPU by forcing it to free the bus Procedure: –Processor sends message to DMA (R/W, where, how much) –Processor continues while I/O proceeds –At the end, I/O module sends interrupt signal to the processor

18
Differs from an interrupt
DMA can interrupt the processor in the middle of an instruction (before fetch, after decoding, after execution)
It does not interfere with the running program
The CPU is idled (waiting for access to the bus)
The system just becomes slower

23
Principles of I/O Hardware Block Devices –Store information in fixed sized blocks –Size bytes. –Blocks are the addressing units for reading and writing –Example: disks –More problematic example: tapes Not randomly accessible Rewind takes far more time than a seek on a disk Writing in the middle of a tape is not always possible Discussion is not really a practical one

25
Principles of I/O Hardware (cont)
Block and character devices are still a basic abstraction at the O.S. level (file systems)
Not everything fits:
–Clocks are not addressable and do not produce token streams; they produce interrupts instead
–Memory mapped devices are character addressable
Physical details and peculiarities are left to the drivers

26
Hardware Principles: controllers
I/O units have electronic and mechanical components, separated in a modular design.
The electronic component is the “device controller” or “adapter”
–A printed circuit board to be plugged into the computer
–Designed independently for many devices
–Standardised interfaces (ANSI, ISO, IEEE, …)
–Is the target for the O.S.
–Bus, I/O channels
[Figure: CPU, Mem, Disk and Prtr attached to a common bus]

28
Principles: controllers (cont)
–Example: video card
  Reads bytes from its own memory and generates signals to steer the CRT
  Programming the electron beam directly is clearly unfeasible for normal programmers
  The controller must offer an abstract interface
Controller-CPU interface
–Based on controller registers
–These may be part of the normal address space of the computer (memory mapped I/O) (68x0)

29
Principles: controllers(cont) Specific address space for I/O (Intel ports IN and OUT instructions) In both cases the decoding logic on the card recognizes the addresses They are normally served by system calls (software interrupts) supporting ‘abstract’ commands such as READ, WRITE, SEEK, FORMAT, RECALIBRATE CPU passes commands to the controller and continues. It will be interrupted when the transaction has finished

31
Hardware Principles: DMA (cont)
With DMA:
–The controller keeps a memory address (called the DMA address) and a counter indicating the number of bytes to be copied, in addition to the disk address of the block(s) requested.
–Once the block is in the controller buffer, the controller copies the bytes one by one to the DMA address, decrementing the counter and incrementing the DMA address
–An interrupt is generated only when the counter reaches zero
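
The copy loop described above can be sketched in a few lines. This is only a simulation under stated assumptions: main memory is modelled as a `bytearray`, and the function name `dma_copy` and the returned "interrupt" flag are illustrative, not part of any real controller interface.

```python
def dma_copy(controller_buffer: bytes, memory: bytearray,
             dma_address: int, count: int) -> bool:
    """Simulate a controller copying `count` bytes into main memory.

    Returns True to stand for the interrupt raised at the end
    (hypothetical convention for this sketch).
    """
    source_index = 0
    while count > 0:
        # One byte per step: copy, then advance the DMA address
        # and decrement the byte counter.
        memory[dma_address] = controller_buffer[source_index]
        source_index += 1
        dma_address += 1
        count -= 1
    # The counter has reached zero: only now is the CPU interrupted.
    return True


memory = bytearray(16)
interrupt = dma_copy(b"DISK", memory, dma_address=4, count=4)
```

The processor is not involved inside the loop; it only sets up the address and the counter and later receives the single interrupt.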

32
Hardware Principles: DMA (cont) Why a buffer? –Hardware produces bits at a steady pace –Controller has to share the system bus with the CPU and other devices, unpredictable –Copying is not time critical –Controllers with small buffers may go directly to the memory, (overrun risk) –Buffer simplifies the design of the controller Controller processing time leads to interleaving

33
Hardware Principles: DMA (cont)
Interleaving on disks
–3600 rpm, revolution time about 16.7 ms
–This time can be used for error correction and (start up of) DMA transfer
–Room between successive sectors can be adjusted by interleaving

37
Design objectives
Efficiency
–I/O is very slow in comparison with the CPU
–Multiprogramming was partly invented to circumvent this discrepancy
–Even with a high level of multiprogramming, the I/O cannot keep up with the CPU
–Swapping is not I/O but it sometimes interferes with it
–Disk efficiency is important

38
Design objectives Generic –For simplicity and to avoid errors –Uniform treatment of all devices –Due to the diversity of the devices, only layered system can offer this uniformity –Layers depend on lower layers for specific services, implement general set of services themselves –Number of layers can be different from one kind of devices to another

40
Design objectives Simple case: logical device, offering only a stream of bytes or records –Logical I/O: has nothing to do with details of the device, offers logical I/O functions –Device I/O: translates commands from the layer above to sequences of I/O instructions. Buffering may be part of this layer –Scheduling and controlling: this level determines the order in which the requests can be served. This layer interacts directly with the I/O device

41
Design objectives Communication: network interface, offering connection to a network –Communications layer: replaces the logical layer. It may be layered itself according to e.g. the OSI seven layer model –Other layers are identical with the logical case

43
Efficiency: I/O Buffering
The I/O module may free the process from continuous checking by working autonomously
If the process is swapped out, the DMA transfer cannot proceed, so swapping out the complete process becomes problematic
A single process can even cause a deadlock -> buffering (single buffer, double buffer, circular buffer)

44
I/O Buffering: single buffer I/O module copies to a buffer, not directly to the memory. Once I/O has succeeded, buffer is copied to the user memory I/O module may anticipate and start reading the next block immediately Complicates OS –Has to manage the buffers –Has to take care while swapping to the same disk (swapping out a process has to wait until I/O has finished) Buffers can contain blocks, lines or bytes

45
I/O Buffering: double buffer
Without buffers, the time per block is
–T = Computation + Transfer
The time for reading a block with a single buffer:
–T = max{Computation, Transfer} + Move
The Move time to copy the buffer into main memory can be recovered if one uses a second buffer, keeping the I/O channel busy all the time
–T = max{Computation+Move, Transfer}
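
The three formulas can be compared side by side with a small helper. The function name and the sample timings (5, 10 and 1 time units) are assumptions chosen for illustration only; they do not come from the slides.

```python
def time_per_block(computation: float, transfer: float,
                   move: float, buffers: int) -> float:
    """Time per block for 0, 1 or 2 buffers, using the formulas above."""
    if buffers == 0:
        return computation + transfer
    if buffers == 1:
        return max(computation, transfer) + move
    if buffers == 2:
        return max(computation + move, transfer)
    raise ValueError("this sketch only covers 0, 1 or 2 buffers")


# Hypothetical timings: computation 5, transfer 10, move 1.
for n in (0, 1, 2):
    print(n, "buffer(s):", time_per_block(5, 10, 1, n))
```

With these sample values the time per block drops from 15 to 11 to 10 units, which shows how the second buffer hides the Move time behind the transfer.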

46
I/O Buffering: circular buffer If the computation time in I/O burst phases is very small, the two buffers will not be able to cope (max{C,T} >> C) Multiple buffers in a ring can solve this problem by doing I/O also in the computational phases Buffering softens the peaks in the I/O. If the demand for I/O is on average above the I/O capacity, the system will run out and the process will have to wait.

47
Generic : Software principles Layered architecture allows for –Hiding of the peculiarities of the hardware –Presentation of clean interfaces for the users –Construction of uniform interfaces After stating the objectives, we will discuss the layers from the interrupt level up to the user environment

48
From “Mobile Commerce: A New Frontier”, U. Varshney, R.J. Vetter, R. Kalakota, in Computer, October 2000

51
Software Objectives By Example
Device independence: programs must be able to access files through a device independent interface
–sort output
–Naming conventions cannot depend on the device
–Unix devices can be reached through the directory interface (mounting)
Error handling: errors should be taken care of in the lowest layer possible (controller, driver, OS, user)
–Errors are often transient, caused by a temporary defect
–They can be handled without alarming the user

53
Software: how to cope with interrupts
“Hard” part of the software
–Should be hidden in the lowest layers
Essentially pseudo-parallel
–Should be synchronized with the user process
Semaphores and monitors can be used to model the interaction with the I/O process
–E.g. socket threads
Messaging.

55
Software: device drivers
A driver serves one type, or at most one class, of closely related devices.
It contains all device dependent code.
E.g. one terminal driver serving all terminals is not feasible: the complexity of terminals can differ too much.
It is the only part of the OS that knows the controllers, their register structure, sectors, tracks, cylinders, heads, interleaving, motors, ...
The device driver translates abstract, device independent software requests into device dependent signals.

56
Software: device drivers (cont)
Example: read a block from disk
–If the driver is idle when the request arrives
  Start immediately
–Else
  Move the request to the queue
Execute the request:
–Translate it to the disk specifics
  Find the position of the block
  Check whether the disk is rotating
  Check the arm position
–Pass the tasks to the device one by one
  Some controllers can accept a list of tasks, others need assistance of the driver after each subtask
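
The "start immediately or queue" decision can be sketched as follows. The class `DiskDriver`, its method names and the `started` log are hypothetical; a real driver would program controller registers where this sketch merely records the block number.

```python
from collections import deque


class DiskDriver:
    """Toy driver: one request in progress, the rest waiting in a queue."""

    def __init__(self):
        self.busy = False
        self.queue = deque()
        self.started = []  # block numbers handed to the (imaginary) controller

    def request(self, block: int) -> None:
        if not self.busy:
            # Driver is idle: start immediately.
            self.busy = True
            self._start(block)
        else:
            # Driver is busy: move the request to the queue.
            self.queue.append(block)

    def _start(self, block: int) -> None:
        # Placeholder for the translation to disk specifics.
        self.started.append(block)

    def on_interrupt(self) -> None:
        """Called when the controller reports that the current request is done."""
        if self.queue:
            self._start(self.queue.popleft())
        else:
            self.busy = False


driver = DiskDriver()
for block in (7, 3, 9):
    driver.request(block)
driver.on_interrupt()  # request 7 finished, request 3 is started
```

The queue is served here in arrival order; a disk driver would normally reorder it to limit arm movement.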

57
Software: device drivers (cont)
Once the task(s) have been passed, there are two possibilities:
–The driver has to wait until the controller has performed the task(s), in which case the driver has to block until interrupted
  Example: disk I/O, scanner
–Or the controller can finish the job without delay
  Example: move the graphics over the screen
The driver has to check for possible errors before passing any results and/or status to the software
Once all software tasks have been handled, the driver will block

58
Software: Device Independence Large part of the I/O software is independent of the device. Some of these parts are executed in the driver for efficiency (cfr LINUX & Apache) Independent parts normally include –Uniform interfaces for the device drivers –Naming of the devices –Protection and privacy for devices –Block size independent of the device –Buffering –Memory assignment on block devices –Assignment of dedicated devices –Error handling

59
Software: Device Independence Uniform interfaces Software should not be aware of detail differences between the devices Users must be able to handle a disk as a disk and a scanner as a scanner, not work with a “Brother Master Scan” or a “Sister LaserJet” but with an A4 scanner and a black and white printer.

60
Software: Device Independence
Naming of the devices
How do files and I/O devices get a name?
A symbolic device name is mapped to a driver
E.g. UNIX/LINUX:
–Device name: /dev/lp determines the I-node for a special file
–The I-node contains
  the major device number, used to find the driver
  the minor device number, passed as a parameter to the driver
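
The lookup chain (name, I-node, major number, driver, minor number) can be mimicked with two tables. Everything in this sketch is an assumption made for illustration: the table contents, the device numbers and the two driver functions are invented and do not describe a real kernel.

```python
# Hypothetical "I-node" table: special file name -> (major, minor).
SPECIAL_FILES = {
    "/dev/lp": (6, 0),
    "/dev/tty1": (4, 1),
}


def printer_driver(minor: int, data: str) -> str:
    return f"printer unit {minor} prints {data!r}"


def terminal_driver(minor: int, data: str) -> str:
    return f"terminal {minor} shows {data!r}"


# Hypothetical driver table indexed by major device number.
DRIVERS = {6: printer_driver, 4: terminal_driver}


def device_write(name: str, data: str) -> str:
    major, minor = SPECIAL_FILES[name]  # the name determines the I-node
    driver = DRIVERS[major]             # the major number selects the driver
    return driver(minor, data)          # the minor number is passed as a parameter


print(device_write("/dev/lp", "hello"))
```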

62
Software: Device Independence Device independent block size Communication block sizes may differ –Software provides one size, handling a set of sectors as one block –Devices become abstractions –Block devices can be represented as character devices

63
Software: Device Independence Buffering Block devices: –Hardware delivers one block at a time –Software reads in varying sizes –After reading one block, the O.S. keeps the whole block in a buffer Character devices –May be slower than the processes –Delivery of characters may be an independent process (keyboard)

64
Software: Device Independence
Assignment of dedicated devices
Some devices cannot be used by two processes at the same time
–The OPEN system call reports an error in case the device is busy
–CLOSE frees the device
–Spooling processes may play an intermediary role

65
Software: Device Independence Error handling Primary handling is by the drivers –It can often react and repair the error before the software layer above notices –Only when, after several tries, repairing fails, the higher layers are informed Error handling at higher levels depends on the environment in which the software is functioning

72
Hardware
Disks are organized in cylinders, tracks and sectors.
Sectors
–between 8 and 32 per track
–equal number of bytes (1K, 0.5K, 0.25K) in the middle as well as on the border
A controller can run a number of seeks on different disks at the same time
Controllers only read/write one disk at a time
This influences the disk driver

73
Algorithms for the disk arm
Access time = seek time + rotational delay + transfer time
–Seek time: time needed to move the arm to the cylinder (dominant)
–Rotational delay: time before the sector appears under the head
–Transfer time: time to transfer the data
The dominance of seek time leaves room for optimisation

75
Example: rotational latency and transfer time
3600 rpm, 16.7 ms/rotation
Average rotational latency is 8.3 ms
Tt = b/(rN)
–Tt = transfer time
–b = number of bytes to be transferred
–N = number of bytes per track
–r = rotation speed in rotations per second
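
The figures above follow directly from the rotation speed, as this short calculation shows. The track layout used for the transfer time (32 sectors of 512 bytes per track) is an assumption for illustration; the slides do not state N.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 60_000 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """On average the wanted sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2


def transfer_time_ms(b: int, n: int, rpm: float) -> float:
    """Tt = b / (r * N), with r in rotations per second, converted to ms."""
    r = rpm / 60
    return b / (r * n) * 1000


print(round(rotation_time_ms(3600), 1))               # 16.7
print(round(average_rotational_latency_ms(3600), 1))  # 8.3
# Assumed track layout: 32 sectors of 512 bytes, reading one sector.
print(round(transfer_time_ms(512, 32 * 512, 3600), 2))  # 0.52
```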

77
Example: case
If the sectors are accessed at random rather than sequentially, we get:
Time per sector = 20 + 8.3 + 0.5 = 28.8 ms (seek + rotational latency + transfer)
Total time for 256 sectors = 256 × 28.8 ms = 7.37 s
It is important to obtain an optimal sequence for the reading of the sectors

78
Optimisation
A heavily loaded disk allows for a strategy to minimize the arm movement
The situation is dynamic: the disk driver keeps a table of requested sectors per cylinder
E.g.: while a request for cylinder 11 is being handled, requests for cylinders 1, 36, 16, 34, 9 and 12 arrive. Which one is to be handled after the current request?

79
Optimisation (cont)
In the order of arrival (FCFS) the total length is:
–|11-1|+|1-36|+|36-16|+|16-34|+|34-9|+|9-12|=111
In the order “shortest seek time first, SSTF” (cfr shortest job first) we save about 45%:
–|11-12|+|12-9|+|9-16|+|16-1|+|1-34|+|34-36|=61
Problem: starvation. Under heavy load the arm stays in the middle of the disk, edge cylinders are poorly served, and the strategy is unfair (cfr elevators)
“Lift algorithm, SCAN”: keep moving in the same direction until no requests are ahead (start upwards):
–|11-12|+|12-16|+|16-34|+|34-36|+|36-9|+|9-1|=60
Upper limit: 2*number of cylinders

80
Optimisation (cont)
A smaller variance is reached by moving the arm in one direction only, always returning to the lowest number at the end of the road (CSCAN):
–|11-12|+|12-16|+|16-34|+|34-36|+|36-1|+|1-9|=68
Seeks are cylinder by cylinder:
–a number of tracks
It may happen that the arm sticks to a cylinder.
–N-step-SCAN: segment the queue into segments of N requests which are handled using SCAN
–FSCAN: use two queues; while one is being handled with SCAN, the other one is being refilled.
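
The arm-movement totals for the example queue (arm at cylinder 11, requests for 1, 36, 16, 34, 9 and 12) can be checked with a minimal sketch of the four orderings. The function names are illustrative, SCAN and CSCAN are assumed to start in the upward direction, and the static queue ignores requests that arrive during service.

```python
def total_movement(start: int, order: list) -> int:
    """Sum of cylinder distances when serving `order` from `start`."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


def fcfs(start: int, requests: list) -> list:
    return list(requests)


def sstf(start: int, requests: list) -> list:
    pending, position, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def scan(start: int, requests: list) -> list:
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down  # go up first, then reverse direction


def cscan(start: int, requests: list) -> list:
    up = sorted(c for c in requests if c >= start)
    wrapped = sorted(c for c in requests if c < start)
    return up + wrapped  # go up, jump back to the lowest request, go up again


requests = [1, 36, 16, 34, 9, 12]
for policy in (fcfs, sstf, scan, cscan):
    print(policy.__name__, total_movement(11, policy(11, requests)))
```

The printed totals are 111, 61, 60 and 68, matching the sums written out on the slides.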

81
Optimisation (cont) Sector number under the head can be read –> optimal order of requests within one cylinder More than one drive: look for a well positioned drive after each transfer Rotation speed versus seek time RAID (Redundant Array of Inexpensive Disks) –E.g. 38 bits form a word of 32 bits with Hamming code –Each bit is stored on a separate disk

83
Caches
The driver may cache tracks:
–Reading one track does not take long: the arm need not be moved and the driver has to wait for the sector anyhow
–Disadvantage: the driver has to copy the data using the CPU, while the controller may use DMA
The controller may build an internal cache:
–Is transparent for the driver
–Data transfer uses DMA
–In this case the driver should not do any caching

85
Clock Hardware
–50 Hz clocks (1 interrupt (clock tick) per voltage cycle)
  Simple, cheap, not very accurate, not very functional
–High precision clocks (5-100 MHz, or higher)
  Contain a quartz oscillator
  Steers a counter counting down
  Generates an interrupt when the counter reaches 0
  The counter is possibly reloaded from a programmable register
  One chip normally implements multiple clocks
One-shot mode: the clock counts down from the register value once and waits for software to start it again
Square-wave mode: the counter is automatically reloaded (generates clock ticks)
Ranges: e.g. a 1000 MHz clock with a 16-bit register can time intervals between 1 nanosecond and 65.535 microseconds.
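
The range of a programmable clock follows from its frequency and the width of its counter register. The helper below is a sketch with an invented name; it assumes the smallest programmable count is 1 and the largest is the all-ones register value.

```python
def interval_range_seconds(frequency_hz: float, register_bits: int):
    """Shortest and longest interval a down-counter can time."""
    tick = 1 / frequency_hz                 # time between two counter pulses
    longest = (2 ** register_bits - 1) * tick
    return tick, longest


shortest, longest = interval_range_seconds(1_000_000_000, 16)
print(shortest, longest)  # roughly 1e-09 s and 6.5535e-05 s
```

A 1000 MHz crystal pulses the counter every nanosecond, so a 16-bit register covers 1 ns up to 65,535 ns.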

86
Clock Software
Software is responsible for the semantics behind the clock ticks:
–Time of day (since 1/1/1970)
  Is an easy task: just calculate the exact time between two ticks and adjust the clock on each interrupt
  The size of the time register may cause a problem:
  –A 32-bit register overflows after about 2 years when storing 60 Hz ticks
  –64 bits is more expensive, but lasts forever
  –Store seconds instead of ticks (2^32 seconds is about 136 years)
  –Use another reference instead of 1/1/1970 (start time)
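
The overflow figures can be reproduced with a one-line formula. The function name is illustrative, and a year is taken as 365.25 days, which is an assumption of this sketch.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_until_overflow(bits: int, increments_per_second: float) -> float:
    """How long a counter of `bits` bits lasts at the given increment rate."""
    return 2 ** bits / increments_per_second / SECONDS_PER_YEAR


print(round(years_until_overflow(32, 60), 2))  # 60 Hz ticks in 32 bits: 2.27
print(round(years_until_overflow(32, 1), 1))   # seconds in 32 bits: 136.1
print(years_until_overflow(64, 60) > 1e9)      # 64 bits: billions of years
```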

87
Clock Software –Administration of process time slices Each running process has “time left” counter This counter is decremented at each interrupt –Administration of CPU usage Counter starts when process starts Counter is part of the “Process environment” Is stopped while handling an interrupt Field in the process table can be used directly (through pointer to running process) Interrupts cause problems

88
Clock Software
SLEEP system call (UNIX)
–E.g. late ack of a packet sent, sleeping e-student
The clock driver has a limited number of hardware clocks
–Implements virtual clocks
–Uses a table with all times for the pending timers and one variable with the next signal time
–In case of heavy clock usage, the signal times may be kept in an ordered linked list (e.g. 4203, 4207, 4213)
[Figure: clock header with the current time and the next signal, pointing to a linked list of timer entries 3 (s1), 4 (s2), 6 (s3)]
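
One common way to keep such an ordered list is to store, for each entry, only the number of ticks after the previous entry, so that a clock interrupt touches the head of the list only. The sketch below assumes that representation; the class name, the method names and the start time 4200 are illustrative, and a Python list stands in for the linked list.

```python
class DeltaTimerList:
    """Pending timers stored as [ticks after previous entry, signal name]."""

    def __init__(self, current_time: int):
        self.current_time = current_time
        self.deltas = []

    def add(self, signal_time: int, name: str) -> None:
        if signal_time <= self.current_time:
            raise ValueError("signal time must lie in the future")
        remaining = signal_time - self.current_time
        index = 0
        # Walk past all entries that fire no later than the new one.
        while index < len(self.deltas) and self.deltas[index][0] <= remaining:
            remaining -= self.deltas[index][0]
            index += 1
        # The entry that now follows the new one is shortened accordingly.
        if index < len(self.deltas):
            self.deltas[index][0] -= remaining
        self.deltas.insert(index, [remaining, name])

    def tick(self) -> list:
        """Advance one clock tick; return the signals that are due."""
        self.current_time += 1
        fired = []
        if self.deltas:
            self.deltas[0][0] -= 1  # only the head entry is decremented
            while self.deltas and self.deltas[0][0] == 0:
                fired.append(self.deltas.pop(0)[1])
        return fired


timers = DeltaTimerList(current_time=4200)
timers.add(4213, "s3")
timers.add(4203, "s1")
timers.add(4207, "s2")
print(timers.deltas)  # [[3, 's1'], [4, 's2'], [6, 's3']]
```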

89
Clock Software
Watchdog timers
–Floppy disk drive
  Start the motor
  Wait for 500 milliseconds
  -> better to wait for 3 seconds after an I/O operation before stopping the motor, just in case a new request arrives
–Watchdog timers start a user specified routine, within the code of the caller, after the time has elapsed
Profiling
–For program performance analysis
–Shows where the CPU time is spent

93
Terminals, hardware (cont)
Interface cards may have a CPU on board, and be able to serve more terminals
Intelligent terminals may be able to perform complicated operations on the screen (e.g. X-terminals) while still connected through a device as prehistoric as the RS-232 UART

95
Terminals, hardware (cont)
The monitor talks pixels; the video controller modulates the electron beam or steers the liquid crystals
Speed is obtained through the memory and the electronics
The video RAM can hold characters or bits
–Characters will typically be read 14 times, once for each scan line on the screen in which part of the character is displayed
–Bits will be stored per pixel
Input through the keyboard
–Normally passes only a code for the key touched

96
Input software Keyboard driver caches keyboard input –Raw mode: Driver passes characters unchanged to software Buffering is limited to speed differences Application receives characters immediately –Cooked mode Driver buffers one line until it is finished Driver handles corrections made by the user while typing a line –Often applications have the choice Nowadays, window driven applications use raw mode at the lowest level and perform buffering at the window level
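
The difference between the two modes can be illustrated with a toy version of cooked-mode line editing. The function name and the choice of editing characters (backspace to erase one character, Ctrl-U to discard the line) are assumptions of this sketch, not taken from the slides; in raw mode the input would simply be passed on unchanged.

```python
def cook(raw_chars: str, erase: str = "\b", kill: str = "\x15",
         newline: str = "\n") -> list:
    """Return the completed lines an application would receive in cooked mode."""
    line, completed = [], []
    for ch in raw_chars:
        if ch == erase:
            if line:
                line.pop()        # drop the last typed character
        elif ch == kill:
            line.clear()          # discard the whole unfinished line
        elif ch == newline:
            completed.append("".join(line))  # the line is finished: deliver it
            line.clear()
        else:
            line.append(ch)
    return completed              # an unfinished line stays in the buffer


print(cook("lz\bs -l\n"))  # ['ls -l']
```

The application never sees the mistyped "z": the correction is handled entirely inside the driver before the line is delivered.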

97
Input software (cont) Keyboard driver catches characters after –Interrupts: handles character during interrupt –Messages: message may contain the character or refer to a small buffer (problem with real time level of the driver, safer if the messaging is not fail proof) Keyboard driver transforms the key number into a character according to a table Keyboard buffering: buffer pool (buffers of equal size, e.g. 16 characters) or a separate buffer for each terminal (typically 200 characters) Echoing is (was) done by the OS, or the shell. May be confusing for the user.

98
Input software (cont) Handling of tabs, backspaces,… were typical problems with terminals One problem survived: end of line –Logically (from the typist’s viewpoint) one needs a CR to bring the cursor back to the beginning of the line and a LF to go to the next one –These two characters are hidden behind the ENTER key –The OS can decide how to represent end of line UNIX: Line feed only DOS: Carriage return and line feed –LF is ASCII 10, CR is ASCII 13, ^M -> problems with file transfer

100
Output software Serial (RS-232) and memory mapped approaches differ Serial terminals have an output buffer to which characters are sent until it is full or until a line ends. Once full, the real output is initiated and the driver sleeps until interrupted. Memory mapped terminals can be accessed through normal memory addressing procedures. Some characters receive a special treatment. The driver is doing more screen manipulation. Special functions such as scrolling and animation may be done through special registers (e.g. register with the position of the top line)