Mayan Moudgill, White Plains US

Mayan Moudgill, White Plains, NY US

Patent application number

Description

Published

20080209166

Method of Renaming Registers in Register File and Microprocessor Thereof - A microprocessor for processing instructions comprises multiple clusters for receiving the instructions, each of the clusters having a plurality of functional units for executing the instructions, multiple register sub-files each having multiple registers for storing data for executing the instructions, wherein each of the clusters is associated with corresponding one of the register sub-files so that an instruction dispatched to a cluster is executed by accessing registers in a register sub-file associated with the cluster to which the instruction is dispatched, a register-renaming unit for renaming target registers in an instruction with registers in a register sub-file associated with a cluster to which the instruction is dispatched, and issue-queue units each of which is associated with a corresponding one of the clusters, wherein an issue-queue unit holds instruction renamed by the register-renaming unit until the renamed instruction is issued to be executed in a cluster associated with the issue-queue unit.

METHOD FOR ENABLING MULTI-PROCESSOR SYNCHRONIZATION - A method for providing at least one sequence of values to a plurality of processors is described. In the method, a sequence generator from one or more sequence generators is associated with a memory location. The sequence generator is configured to generate the at least one sequence of values. One or more read accesses of the memory location are enabled by a processor from the plurality of processors. In response to enabling the read access, the sequence generator is executed so that it returns a first value from the sequence of values to the processor. After executing the sequence generator, the sequence generator is advanced so that the next access generates a second value from the sequence of values. The second value is sequentially subsequent to the first value.

07-30-2009

20090276432

DATA FILE STORING MULTIPLE DATA TYPES WITH CONTROLLED DATA ACCESS - A method and apparatus for efficiently storing multiple data types in a computer's register or data file. A single data file can store data with a variety of sizes and number formats, including integers, fractions, and mixed numbers. The register file is partitioned into fields, such that only the relevant portions of the register file are read or written.

11-05-2009

20100031007

METHOD TO ACCELERATE NULL-TERMINATED STRING OPERATIONS - A method reads and compares first and second register values, each with a size of at least two bytes. A third register indicates a match if: (1) a byte in the first register value is equal to (or, alternatively, not equal to) a corresponding byte in the second register value, or (2) if a byte in the first register value is zero. Next, a fourth register value is set to one of the following: (1) a count of the matching byte, if the corresponding bytes in the first and second register values are equal (or, alternatively, are not equal), or (2) a number outside of a range between 0 and n−1, if the corresponding bytes in the first and second register values are not equal (or, alternatively, are equal). The value, n, is an integer equal to the number of bytes in the first and second register values.

02-04-2010

20100115527

METHOD AND SYSTEM FOR PARALLELIZATION OF PIPELINED COMPUTATIONS - A method of parallelizing a pipeline includes stages operable on a sequence of work items. The method includes allocating an amount of work for each work item, assigning at least one stage to each work item, partitioning the at least one stage into at least one team, partitioning the at least one team into at least one gang, and assigning the at least one team and the at least one gang to at least one processor. Processors, gangs, and teams are juxtaposed near one another to minimize communication losses.

05-06-2010

20100122068

MULTITHREADED PROCESSOR WITH MULTIPLE CONCURRENT PIPELINES PER THREAD - A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.

05-13-2010

20100199073

MULTITHREADED PROCESSOR WITH MULTIPLE CONCURRENT PIPELINES PER THREAD - A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.

08-05-2010

20100199075

MULTITHREADED PROCESSOR WITH MULTIPLE CONCURRENT PIPELINES PER THREAD - A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.

08-05-2010

20100228938

METHOD AND INSTRUCTION SET INCLUDING REGISTER SHIFTS AND ROTATES FOR DATA PROCESSING - A method includes identifying a first register with M bits and a second register with N bits. The process also includes shifting K bits, where K<=N, from the second register into the first register. The shifting operation executes a left shift operation including reading bits K . . . N−1 from the first register, writing bits K . . . N−1 into bit positions O . . . N−K−1 of the first register, reading K bits from the second register, and writing K bits from second register into bit positions N−K . . . N−1 of first register, or a right shift operation including reading bits O . . . N−K−1 from the first register, writing bits O . . . N−K−1 into bit position K . . . N−1 of the first register, reading the K bits from the second register, and writing K bits from second register into bit positions O . . . K−1 of first register.

ACCELERATING TRACEBACK ON A SIGNAL PROCESSOR - A method executed by an instruction set on a processor is described. The method includes providing a tbbit instruction, inputting a first index for the tbbit instruction, loading a second value for the tbbit instruction, wherein the second value comprises at least 2

10-28-2010

20100293210

SOFTWARE IMPLEMENTATION OF MATRIX INVERSION IN A WIRELESS COMMUNICATION SYSTEM - A digital signal processor is provided in a wireless communication device, wherein the processor comprises a vector unit, first and second registers coupled to and accessible by the vector unit; and an instruction set configured to perform matrix inversion of a matrix of channel values by coordinate rotation digital computer instructions using the vector unit and the first and second registers.

LATCH-BASED IMPLEMENTATION OF A REGISTER FILE FOR A MULTI-THREADED PROCESSOR - A processor register file for a multi-threaded processor is described. The processore register file includes, in one embodiment, T threads, having N b-bit wide registers. Each of the registers includes a b-bit master latch, T b-bit slave latches connected to the master latch, and a slave latch write enable connected to the slave latches. The master latch is not opened at the same time as the slave latches. In addition, only one of the slave latches is enabled at any given time. As should be apparent to those skilled in the art, T, N, and b are all integers. Other embodiments and variations are also provided.

10-06-2011

20110252211

HALTABLE AND RESTARTABLE DMA ENGINE - A method is described for operation of a DMA engine. Copying is initiated for transfer of a first number of bytes from first source memory locations to first destination memory locations. Then, a halt instruction is issued before the first number of bytes are copied. After copying is stopped, a second number of bytes is established, encompassing those bytes remaining to be copied. After the transfer is halted, a quantity of the second number of bytes is identified. Quantity information is then generated and stored. Second source memory locations are identified to indicate where the second number of bytes are stored. Second source memory location information is then generated and stored. Second destination memory locations are then identified to indicate where the second number of bytes are to be transferred. Second destination memory location information is then generated and stored.

10-13-2011

20110254588

POWER SAVING CIRCUIT USING A CLOCK BUFFER AND MULTIPLE FLIP-FLOPS - A circuit is described including a clock input for at least one clock signal. Only one clock buffer is connected to the clock input to generate, based on the at least one clock signal, at least a first modified clock signal and a second modified clock signal. A plurality of flip-flops are connected to the clock buffer. Each of the flip-flops receive the first and second modified clock signals. A plurality of data inputs are each connected to at least one of the plurality of flip-flops to provide input data to the plurality of flip-flops. A plurality of data outputs each are connected to at least one of the plurality of flip-flops to provide output data from the plurality of flip-flops. Each of the plurality of flip-flops transform the input data to the output data utilizing the first modified clock signal and the second modified clock signal.

10-20-2011

20120096243

MULTITHREADED PROCESSOR WITH MULTIPLE CONCURRENT PIPELINES PER THREAD - A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.