After swap() is performed, x will contain the value 0 and y will contain 1; their values have been exchanged. This operation may be generalized to other types of values, such as strings and aggregated data types. Comparison sorts use swaps to change the positions of data.

The simplest and probably most widely used method to swap two variables is to use a third temporary variable:

define swap (x, y)
temp := x
x := y
y := temp

While this is conceptually simple and in many cases the only convenient way to swap two variables, it uses extra memory. Although this should not be a problem in most applications, the sizes of the values being swapped may be huge (which means the temporary variable may occupy a lot of memory as well), or the swap operation may need to be performed many times, as in sorting algorithms.

In addition, swapping two variables in object-oriented languages such as C++ may involve one call to the classconstructor and destructor for the temporary variable, and three calls to the copy constructor. Some classes may allocate memory in the constructor and deallocate it in the destructor, thus creating expensive calls to the system. Copy constructors for classes containing a lot of data, e.g. in an array, may even need to copy the data manually.

XOR swap uses the XOR operation to swap two numeric variables. It is generally touted to be faster than the naive method mentioned above; however it does have disadvantages. XOR swap is generally used to swap low-level data types, like integers. However, it is, in theory, capable of swapping any two values which can be represented by fixed-length bitstrings.

Containers which allocate memory from the heap using pointers may be swapped in a single operation, by swapping the pointers alone. This is usually found in programming languages supporting pointers, like C or C++. The Standard Template Library overloads its built-in swap function to exchange the contents of containers efficiently this way.[1]

As pointer variables are usually of a fixed size (e.g., most desktop computers have pointers 64 bits long), and they are numeric, they can be swapped quickly using XOR swap.

Because of the many applications of swapping data in computers, most processors now provide the ability to swap variables directly through built-in instructions. x86 processors, for example, include an XCHG instruction to swap two registers directly without requiring that a third temporary register is used. A compare-and-swap instruction is even provided in some processor architectures, which compares and conditionally swaps two registers. This is used to support mutual exclusion techniques.

XCHG may not be as efficient as it seems. For example, in x86 processors, XCHG will implicitly lock access to any operands in memory to ensure the operation is atomic, and so may not be efficient when swapping memory. Such locking is important when it is used to implement thread-safe synchronization, as in mutexes. However, an XCHG is usually the fastest way to swap two machine-size words residing in registers. Register renaming may also be used to swap registers efficiently.

With the advent of instruction pipelining in modern computers and multi-core processors facilitating parallel computing, two or more operations can be performed at once. This can speed up the swap using temporary variables and give it an edge over other algorithms. For example, the XOR swap algorithm requires sequential execution of three instructions. However, using two temporary registers, two processors executing in parallel can swap two variables in two clock cycles:

This uses fewer instructions; but other temporary registers may be in use, and four instructions are needed instead of three. In any case, in practice this could not be implemented in separate processors, as it violates Bernstein's conditions for parallel computing. It would be infeasible to keep the processors sufficiently in sync with one another for this swap to have any significant advantage over traditional versions. However, it can be used to optimize swapping for a single processor with multiple load/store units.