In a multi-core architecture, say we have M cores and N threads (N > M), and a thread, once assigned to a core, can't migrate to another core's ready queue. We have to design a scheduling policy that balances the load.
I read about this area and found the following:
1) Block multithreading, Interleaved multithreading and Simultaneous multithreading (https://en.wikipedia.org/wiki/Multithreading_(computer_architecture))
2) Round Robin, FIFO, preemptive scheduling (https://en.wikipedia.org/wiki/Scheduling_(computing))
3) Threads inherit a priority number from their process and go to the corresponding core
My question asks me to compute each thread's load before scheduling and then assign the thread to the least-loaded core. Has any research been done in this area? If not, is there any research on computing a thread's computational load at compile time?

In general, any problem of the form "decide what this code is going to do" is undecidable (the undecidability of the halting problem is the best-known example of this). So, while you may be able to determine at compile time what some threads will do, you certainly can't do it for all of them.
– David Richerby, Oct 29 '16 at 14:35

In the load balancing problem you are given $M$ identical cores and $N$ threads. All threads are assumed to be available at the start, and all thread run times are given beforehand. The goal is to minimize the makespan, i.e., the load of the most heavily loaded core. Graham proved that list scheduling (taking the threads one by one and assigning each to the core that currently has the least load) is a 2-approximation. He further proved that longest processing time (LPT) scheduling, where you sort the threads in decreasing order of run time before doing list scheduling, is a 4/3-approximation algorithm. See, for example: https://www.cs.princeton.edu/~wayne/kleinberg-tardos/pdf/11ApproximationAlgorithms-2x2.pdf
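
To make the two greedy rules concrete, here is a minimal Python sketch. The function names `list_schedule` and `lpt_schedule` and the sample run times are my own choices for illustration, and the sketch assumes the run times are known exactly, which is the offline setting described above.

```python
import heapq


def list_schedule(run_times, num_cores, order=None):
    """List scheduling: give each thread to the currently least-loaded core.

    run_times[i] is the (assumed known) run time of thread i.
    order is the sequence in which threads are considered (default: 0, 1, 2, ...).
    Returns (makespan, assignment), where assignment[c] lists the threads on core c.
    """
    if order is None:
        order = range(len(run_times))
    # Min-heap of (current load, core index); ties go to the lower core index.
    heap = [(0, core) for core in range(num_cores)]
    assignment = [[] for _ in range(num_cores)]
    for thread in order:
        load, core = heapq.heappop(heap)  # least-loaded core so far
        assignment[core].append(thread)
        heapq.heappush(heap, (load + run_times[thread], core))
    makespan = max(load for load, _ in heap)
    return makespan, assignment


def lpt_schedule(run_times, num_cores):
    """LPT: consider threads in decreasing order of run time, then list-schedule."""
    order = sorted(range(len(run_times)), key=lambda i: run_times[i], reverse=True)
    return list_schedule(run_times, num_cores, order)


if __name__ == "__main__":
    times = [2, 3, 4, 6, 2, 2]  # hypothetical run times
    print(list_schedule(times, 3))
    print(lpt_schedule(times, 3))
```

On this particular input, plain list scheduling ends with a makespan of 8, while LPT reaches 7, which matches the lower bound of total work divided by the number of cores (19/3, rounded up). That is just one instance, not a demonstration of the general bounds.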

Graham's original result has been extended over the years in several directions: to cases where there are dependences between jobs (i.e., job $j$ needs to finish before job $k$ starts); to cases where each job $i$ may not be scheduled before a given time $a_i$ (known as the arrival time); and to cases where the machines have different properties and the jobs have requirements on the properties of the machine they run on.

When the job lengths and/or arrival times are not known beforehand, the problem becomes much harder: it is now an online scheduling problem. The main research I am aware of in this area is for the case where the jobs do have dependences between them, but the dependences are restricted to fork-join patterns. The work in this area started (afaik) with the paper