Validation, Verification, and Testing

Transcript of Validation, Verification, and Testing

Software testing

The goal of testing is to discover problems with the software before it is put to use. Testing serves two main purposes:

- Validation: verify that the software meets the customer requirements.
- Defect testing: verify that the software does not contain incorrect or undesirable behaviour.

Testing can only show the presence of errors, not their absence.

Development testing

Development testing is a set of defect testing activities carried out by the developer teams who wrote the software themselves.

Reviews and inspections

Software inspections and reviews consist of a systematic analysis that can be used to check all aspects, at all steps, of the software process, e.g.:

- System requirements,
- System design models,
- Program source code,
- The proposed system tests.

Inspections and reviews are "static" techniques, i.e. the code does not need to be executed for verification. They mostly focus on the source code, but since the process is static, it can be applied to any readable part of the software.

Advantages over testing

- Since it is hard to tell whether an undesired program behaviour is caused by one error or by several, testing has to deal with errors one at a time, whereas inspection deals with all potential errors at once.
- Incomplete versions of the system can be inspected, whereas testing would require constructing a fake testing environment to simulate the missing components.
- Inspections and reviews can consider broader quality attributes, which may be difficult to evaluate in defect testing.
- Systematic inspections and reviews will normally consider all parts of the code preemptively, leaving no dark corners in which bugs can hide.

Disadvantages over testing

- Since inspections and reviews are static techniques, they may not uncover problems that arise from unexpected interactions between objects in the system, e.g. concurrency problems in parallel systems. Although such problems are hard to track down in defect testing or debugging, they are also very difficult to imagine, and thus to anticipate in a review.
- Problems relating to system performance are likewise difficult to anticipate in a review, whereas profiling will give an exact map of where in the software performance is lost.
- Inspections and reviews require a lot of time, and require people who have not worked on a component to examine it, which makes them expensive for small developer teams.

Checklists

To facilitate inspections and reviews, checklists of the most common types of faults are used, e.g.:

- Data faults: uninitialized variables, unnamed constants, array bounds, buffer/memory overflows, data structure consistency.
- Control faults: conditional statements, loop termination criteria, statement bracketing, completeness of if-then-else or case statements.
- I/O faults: unexpected inputs, assignment of all output variables.
- Interface faults: parameter count, type, and order; potential side effects.
- Exception faults: catch all possible exceptions or return values.

Levels of testing

Development testing takes place at three levels of granularity:

- Unit tests: test individual objects,
- Component tests: test the interfaces between objects within components,
- System tests: test the system as a whole.

System testing is similar to component testing, yet applied to the system as a whole. For critical systems, the tests should be written by somebody who was not involved in the development of the code being tested.

Unit testing

Unit testing is the process of testing individual program components, such as methods or object classes. Unit tests should cover all aspects of an object's behaviour, and should be automated whenever possible, e.g. by using a test automation framework that compiles and executes all tests automatically.
Unit testing is the lowest level of testing, used to verify that each part of a system works as specified. Unit tests may be formulated before, during, or after the design and/or development of the component. They are usually written by the developers themselves, but may also serve as a form of inspection and review. Each unit test is usually a function that tests one particular aspect of the component in question, and automated tests usually adhere to the same structure: setup, test, and assert.

Unit tests should cover all aspects of an object's behaviour:

- Test all operations associated with the object, e.g. constructors and methods.
- Set and check the value of all attributes associated with the object.
- Put the object in all possible states, and test all possible transitions from one state to another.
- Cover not only all correct uses of the object's interfaces, but also incorrect uses: check for failures, e.g. that illegal operations fail the way they should.

Ideally, unit tests should be run whenever any part of an object is modified and/or extended. Note that tests are not inherited: each subclass of an object class needs its own tests. If the unit tests are created by the developer, appropriate tests should be added as soon as additional features or functionality are added to the object, so that the object can be tested incrementally. Testing early like this brings early defect detection: if tests are implemented before, or as soon as, a component is implemented, problems can be caught before they affect other parts of the system.

Choosing test cases

Since developing and running unit tests can be quite time consuming, the tests themselves should be as effective as possible:

- The tests should show that, when used as expected, the component does what it is supposed to do, e.g. test using normal inputs.
- If there are any defects in the component, they should be revealed by the tests, e.g. test using abnormal inputs.
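The setup/test/assert structure and the normal-versus-abnormal-input rule can be sketched with Python's standard unittest framework; the Counter class and its limit are hypothetical examples, not part of the lecture:

```python
import unittest

class Counter:
    """A minimal object under test: a bounded counter (hypothetical)."""
    def __init__(self, limit):
        self.limit = limit
        self.value = 0

    def increment(self):
        if self.value >= self.limit:
            raise OverflowError("counter is at its limit")
        self.value += 1

class CounterTest(unittest.TestCase):
    def setUp(self):
        # Setup: put the object into a known state before each test.
        self.counter = Counter(limit=2)

    def test_increment_normal_input(self):
        # Test + assert: expected use does what it is supposed to do.
        self.counter.increment()
        self.assertEqual(self.counter.value, 1)

    def test_increment_past_limit_fails(self):
        # Incorrect use: illegal operations must fail the way they should.
        self.counter.increment()
        self.counter.increment()
        with self.assertRaises(OverflowError):
            self.counter.increment()
```

Running `python -m unittest` then discovers and executes these tests automatically, matching the automation advice above.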
The coverage of unit tests can be optimized by considering which sets of inputs will lead to different behaviour in the object, and then making one test for each such set and/or for the limits between the sets.

Unit testing has further benefits:

- Verify modifications: if the tests are run whenever a component is modified, they can be used to detect unanticipated side effects.
- Simplify debugging: when searching for the source of defects in later testing stages, unit tests help exclude potential problem sources.
- Design and documentation: if the unit tests are specified beforehand, they provide a template for the component's design and a source for documenting its use.

Mock objects and harnesses

Not all functional aspects of a component or object can be tested in complete isolation, e.g. functions which rely on other objects as parameters, or interactions with other objects. In such cases, to simplify the testing, mock objects are created which simulate the interacting objects to the degree necessary for the particular test. A set of objects simulating the environment in which the tested object exists, and in which the tests are run, is referred to as a test harness.

Component testing

Component testing deals with composite components, i.e. components consisting of several interacting objects. The functionality of these objects as a component is accessed through a defined component interface, which should be tested against its specification, as in unit testing. Additionally, though, we now need to make sure that:

- The component functions correctly as a whole.
- The interactions between the objects within the component function correctly.

Essentially, we are testing the interfaces between interacting objects.

Interface types

The objects in a component do not always interact in the same way, and different types of interactions can lead to different types of errors:

- Parameter interfaces: data or control is passed to an object by changing or setting its internal values and/or parameters, or by calling its methods.
- Shared memory interfaces: data is exchanged between objects by storing it in an area of memory accessible by both.
- Procedural interfaces: sets of procedures which can be called by the components, e.g. individual steps of a process or utility functions.
- Message-passing interfaces: using observers or a client-server architecture, data and control are transmitted using messages.

Error types

- Interface misuse: parameters to a function or method may be of the wrong type or be given in the wrong order.
- Interface misunderstanding: the interface is used in a formally correct way, but with the wrong intent, i.e. the expected behaviour does not match the actual functionality, or the data provided does not satisfy additional constraints.
- Timing errors: when using shared memory or message passing, the producer and consumer of data or an event may be out of sync, e.g. the consumer misses the data or event supplied by the producer. Such interfaces require care, or additional machinery, to get the synchronization right.

Guidelines

- Examine the code and design to make sure you test all interfaces.
- Choose tests that take both normal and extreme cases into consideration.
- Test the outside interfaces with invalid parameters.
- For procedural interfaces, provide procedures that will fail, e.g. to test whether the component accounts for this.
- For shared memory or message-passing interfaces, use stress testing, i.e. flood the system with messages and/or data, as this will quickly reveal timing or synchronization problems.

System testing

System testing differs from component testing in that it may also include interfaces to external components, or components built by others, e.g. libraries. The system will produce emergent behaviour, both expected and unexpected, that cannot be tested within the individual components. Since the range of possible interactions usually grows exponentially with the system size, it is not possible to test them all systematically.
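The stress-testing guideline for message-passing interfaces can be sketched as a producer/consumer check; the queue-backed interface below is a hypothetical stand-in for a real component, not from the lecture:

```python
import queue
import threading

def consumer(inbox, received):
    """Drain messages until the producer sends the None sentinel."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        received.append(msg)

# A small buffer forces the producer and consumer to synchronize.
inbox = queue.Queue(maxsize=10)
received = []
worker = threading.Thread(target=consumer, args=(inbox, received))
worker.start()

# Flood the interface with far more messages than the buffer can hold.
messages = list(range(10_000))
for m in messages:
    inbox.put(m)          # blocks whenever the buffer is full
inbox.put(None)           # signal that no more messages will arrive
worker.join()

# A lost or reordered message here would indicate a timing or
# synchronization defect in the interface.
assert received == messages
```

Here `queue.Queue` is already thread-safe; swapping it for a hand-rolled shared buffer is exactly the kind of change this stress test would catch.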
To keep system testing manageable, use cases or process models, described in the requirements, usually provide an effective set of functionality to be tested.

Verification vs. validation

- Verification: are we building the product right? The aim of verification is to check that the software meets its stated functional and non-functional requirements.
- Validation: are we building the right product? The aim of validation is to check that the software meets the customer's expectations, which may not be adequately reflected in the requirements.

In both cases, we are concerned with verifying that the software meets its requirements. These tests are usually done during and after development, by developers, testers, and sometimes by the actual end-users. Whether a software product is "fit for purpose" depends on several factors:

- Purpose: how critical is reliability in the domain of application, and what is the level of confidence required?
- Expectations: what tradeoff between reliability and availability are users willing to accept?
- Marketing: is releasing low-reliability beta versions of the final product necessary for publicity?

Release testing

Release testing is the process of testing a system that is intended for use outside of the development team, e.g. because it will be released to the public. It differs from development testing in two important respects:

- A separate group, which has not been involved with the development, is responsible for release testing.
- The focus is on checking the requirements, not on finding bugs.

The main goal is to provide evidence to the customer that the system is fit for purpose, e.g. that it meets the customer's expectations. In this process, the system is usually treated as a "black box", i.e. the testers have no information on how it is built, only on what it should do.

Requirements-based testing

Remember that one of the criteria for good requirements is that they are verifiable, or testable. Requirements-based testing is a systematic approach in which each requirement is considered and a set of tests is derived for it. Each requirement may need several individual tests, e.g. for "Administrative staff should be able to add, delete and modify student records":

- Basic tests to add, delete, and modify student records.
- Tests that try to add two identical students.
- Tests that modify a student record so that it becomes identical to another.

Scenario testing

Scenarios are by-products of the requirements engineering process that describe ways in which the system might be used. They may be taken from the requirements directly, or developed from other functional requirements. Scenarios should be realistic: recall that we are validating and verifying the requirements, not looking for bugs. They should be written as credible and fairly complex user stories, and should involve several correlated but different types of interactions with the system. Scenarios should be played through more than once with minor variations, e.g. making different deliberate mistakes.

Performance testing

Performance testing consists of verifying that the system can process its intended load. This usually means increasing the system load until it breaks or becomes otherwise unusable. Performance tests are usually run with automatically generated inputs, e.g. random transactions in a transaction management system. Although the inputs are random, they should match the characteristics of real data, e.g. the ratio between the different types of transactions. By forcing the system to fail, its robustness features, e.g. the recovery time after a failure, can also be evaluated.

User testing

User testing is the final stage in the testing process, in which actual users provide input and advice. We usually distinguish between three types of user tests:

- Alpha testing: selected users test the software directly with the developers, e.g. at the developers' site.
- Beta testing: a release of the software is provided to users, who may relay their issues to the development team.
- Acceptance testing: the customer tests the system and decides whether or not it will be accepted and deployed. The acceptance criteria and tests are usually derived along with the system requirements and should, ideally, cover the latter.

Guidelines for defect testing

- Choose inputs that force the system to generate all error messages, e.g. do everything that the interface definition says you shouldn't.
- Design inputs that cause input buffers to overflow, e.g. make your objects deal with overflows instead of guessing a maximum size.
- Repeat the same input or series of inputs numerous times, e.g. to verify that the objects' operations don't have hidden side effects.
- Force invalid outputs to be generated, e.g. how does your component deal with operations that fail?
- Force computation results to be too large or too small, e.g. make sure the output buffers don't overflow.
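Two of these guidelines, forcing error messages and repeating the same input to expose hidden side effects, can be sketched in code; the TagSet component below is a hypothetical example, not from the lecture:

```python
class TagSet:
    """Hypothetical component under defect testing: a set of text tags."""
    def __init__(self):
        self._tags = []

    def add(self, tag):
        if not tag:
            # Forced error path: illegal input must produce its error message.
            raise ValueError("tag must be a non-empty string")
        if tag not in self._tags:
            self._tags.append(tag)

    def tags(self):
        return list(self._tags)

ts = TagSet()

# Guideline: force the component to generate its error messages.
try:
    ts.add("")
except ValueError as e:
    assert "non-empty" in str(e)

# Guideline: repeat the same input numerous times and verify that the
# operation has no hidden side effects (the state stays consistent).
for _ in range(1000):
    ts.add("urgent")
assert ts.tags() == ["urgent"]
```

A hidden side effect, e.g. appending a duplicate tag on every call, would make the repeated-input assertion fail immediately.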