In this paper we address the problem of scheduling algorithms that contain a mixture of non-manifest loops [1], variable-latency and fixed-latency units [2], [3] for high-throughput DSP applications. Non-manifest loops are loops in which the number of iterations needed for a calculation is data dependent and hence not known at compile time. The body of a non-manifest loop can have either fixed or variable latency. Variable-latency units are hardware execution units that complete a given operation after a variable number of clock cycles. When designing an application-specific processor for high-throughput applications, the task is to design the processor based on prior knowledge of the algorithm to be implemented. If the algorithm body can be represented as a directed acyclic graph, a static schedule can be obtained by assuming the worst-case latency of the units. However, such a schedule might be inefficient in terms of latency and throughput, because every operation is allotted its worst-case latency even when it completes earlier. A dynamic hardware scheduler, on the other hand, can outperform a static scheduler by reclaiming those wasted clock cycles. In this paper we present a self-scheduling hardware execution unit based on ideas taken from dynamic data-flow machines. This execution unit is capable of scheduling an algorithm body that contains a mixture of non-manifest loops, variable-latency and fixed-latency units without wasting any extra clock cycles.
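
The gap between a worst-case static schedule and data-driven firing can be illustrated with a minimal sketch. It is not the execution unit presented in this paper: the graph, the node names, and all latency values are hypothetical, resource constraints are ignored (every operation has its own unit), and a non-manifest loop is modeled simply as one node whose latency is only known at run time.

```python
# Hypothetical algorithm body as a DAG: node -> list of predecessor nodes.
PREDS = {
    "a": [],          # fixed-latency operation
    "b": [],          # variable-latency operation
    "c": ["a", "b"],  # fixed-latency operation
    "d": ["c"],       # non-manifest loop, modeled as one variable-latency node
}

# Assumed worst-case latencies (clock cycles) used by the static schedule.
WORST = {"a": 1, "b": 8, "c": 2, "d": 8}

# Assumed latencies actually observed for one particular input.
ACTUAL = {"a": 1, "b": 3, "c": 2, "d": 5}


def finish_times(preds, latency):
    """Return the finish time of every node when each node starts as soon
    as all of its operands are available (unlimited execution units)."""
    done = {}

    def visit(node):
        if node not in done:
            # A node with no predecessors starts at cycle 0.
            start = max((visit(p) for p in preds[node]), default=0)
            done[node] = start + latency[node]
        return done[node]

    for node in preds:
        visit(node)
    return done


def makespan(preds, latency):
    """Total number of clock cycles until the last node finishes."""
    return max(finish_times(preds, latency).values())


static_cycles = makespan(PREDS, WORST)    # schedule fixed at compile time
dynamic_cycles = makespan(PREDS, ACTUAL)  # nodes fire when operands arrive
```

With these assumed numbers the static schedule needs 18 cycles, while data-driven firing finishes in 10, so 8 cycles are reclaimed for this input. When every unit happens to take its worst-case latency, the two results coincide.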