Start with something real simple and then build up from there. You should quickly see the pattern that you can generalize.

So imagine that you have a 2:1 MUX and want to use it to implement a function of 1 input, Y=f(A). There are four such functions:

Y=0
Y=1
Y=A
Y=A'

Can you see how you could wire the circuit to implement a particular one of those four as soon as you knew which one you were to implement? Some of the functions can be implemented in more than one way, which is fine, but for our purposes it works best if you apply one of the inputs (in this case it has to be A, since that is the only input you have) to the Select input of the MUX and connect only the remaining signals (in this case just HI and LO, since you have no other inputs) to the data pins.
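If you want to check your wiring before committing to it, a few lines of code will tabulate every option. This is just a sketch: `mux2`, `d0`, and `d1` are names I made up for the MUX and its two data pins, and I am assuming that Select=0 passes `d0` and Select=1 passes `d1`.

```python
def mux2(sel, d0, d1):
    """Ideal 2:1 MUX (assumed convention): pass d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0

# A drives Select; each data pin is tied to LO (0) or HI (1).
# Map each (d0, d1) wiring to the outputs it gives for A=0 and A=1.
wirings = {
    (d0, d1): tuple(mux2(a, d0, d1) for a in (0, 1))
    for d0 in (0, 1)
    for d1 in (0, 1)
}

for (d0, d1), outputs in wirings.items():
    print(f"D0={d0} D1={d1} -> Y(A=0)={outputs[0]} Y(A=1)={outputs[1]}")
```

Match each printed row against the four functions listed above; every one of them should show up exactly once.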

Once you can do that, now make it a function of two inputs, Y=f(A,B). Use the same approach by applying one of the inputs to the Select line and either HI, LO, or one of the other inputs to the data pins. What you should find is that this is not sufficient for some functions, such as Y=A xor B. But with one other gate, you can make it happen.
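You can confirm the "not sufficient" claim by brute force. The sketch below assumes A is on the Select line and each data pin is tied to HI, LO, or B; the names `SOURCES`, `truth_table`, and `reachable` are mine, not standard ones. It deliberately leaves out the extra gate so you can work that part out yourself.

```python
from itertools import product

def mux2(sel, d0, d1):
    """Ideal 2:1 MUX (assumed convention): pass d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0

# Signals allowed on a data pin when A is used as Select.
SOURCES = {
    "LO": lambda b: 0,
    "HI": lambda b: 1,
    "B": lambda b: b,
}

def truth_table(d0_name, d1_name):
    """Outputs for (A,B) = 00, 01, 10, 11 with the named signals on D0 and D1."""
    return tuple(
        mux2(a, SOURCES[d0_name](b), SOURCES[d1_name](b))
        for a, b in product((0, 1), repeat=2)
    )

# Every function reachable with one MUX and no extra gate, keyed by truth table.
reachable = {truth_table(d0, d1): (d0, d1) for d0, d1 in product(SOURCES, repeat=2)}

xor_table = tuple(a ^ b for a, b in product((0, 1), repeat=2))
print(len(reachable), "distinct functions out of 16 possible")
print("XOR reachable without an extra gate:", xor_table in reachable)
```

Looking at which truth tables are missing from `reachable` is a good hint as to what the one extra gate needs to provide.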

Work through this much, and if you still don't see how to generalize it to more than two inputs, show what you get for Y=A xor B using a single 2:1 MUX and one other gate and we will go from there.