I'm trying to implement shadow mapping, but I can't quite understand the exact theory behind it. My problem lies in the second pass, where I need to compare the depth stored in the shadow map with the current pixel's depth.

Assume that in the second render pass I have the depth of one pixel. Which value in the shadow map should I compare it to, and how do I get that value?

I read some tutorials online that say I need to calculate the distance between the light source and the pixel in the second pass, but I don't know how.

Can someone elaborate on this part of the algorithm for me? Thanks in advance.

Sure, I'll give it a shot. The idea is that you need to be comparing depth values in the same space. When you generated your shadow map, you used a certain model, view, and projection matrix for a piece of geometry. When you render the scene from the camera's point of view, you have a model, view, and projection matrix as well, but this results in a value that's in a different space.

When you render from the point of view of the camera (the second pass), you must also transform by the light's model, view, and projection matrices. This gives you a value in the range [-1, 1] for X, Y, and Z. Texture coordinates are in the range [0, 1], though, so you must scale the value by 0.5 (depending on your API's texture-coordinate convention, you may also need to flip the Y coordinate). After doing this you'll have texture coordinates to look into the shadow map and see what depth value the light "saw".

You also have your pixel transformed into the same space, so you know its depth value. If the depth value of the pixel you are drawing is less than the depth value in the shadow map, the pixel you are drawing is closer to the light than what the light saw. In other words, it's not occluded and thus should not be in shadow.
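Here's a rough Python/NumPy sketch of that comparison, the kind of thing you might do in a software pipeline. The matrix and parameter names are placeholders, not from any particular API, and note that mapping [-1, 1] to [0, 1] takes an offset of 0.5 in addition to the 0.5 scale:

```python
import numpy as np

def shadow_test(world_pos, light_view, light_proj, shadow_map):
    """Return True if the point at world_pos is lit (not in shadow).

    world_pos: 3-vector in world space.
    light_view, light_proj: the light's 4x4 view and projection matrices
    (hypothetical names; use whatever your pipeline calls them).
    shadow_map: 2D array of depths rendered from the light's point of view.
    """
    # Transform into the light's clip space, then do the perspective divide
    # to reach NDC, where x, y, and z are all in [-1, 1].
    p = light_proj @ light_view @ np.append(world_pos, 1.0)
    ndc = p[:3] / p[3]

    # Map x and y from [-1, 1] to [0, 1] texture coordinates
    # (scale by 0.5 and offset by 0.5; flip y here if your texture
    # origin convention requires it).
    u = ndc[0] * 0.5 + 0.5
    v = ndc[1] * 0.5 + 0.5
    depth = ndc[2] * 0.5 + 0.5          # same remapping for our depth

    h, w = shadow_map.shape
    tx = min(int(u * w), w - 1)
    ty = min(int(v * h), h - 1)

    # Lit if our depth is not farther than what the light "saw".
    # A small bias avoids self-shadowing ("shadow acne").
    bias = 1e-3
    return depth <= shadow_map[ty, tx] + bias
```

This is just the nearest-texel version; real implementations usually filter several shadow-map samples (PCF), but the comparison itself is the part described above.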

Thanks very much for your help here, but I still have some confusion. So far, my problem is in the second pass, where I transform object vertices from both the view of the camera and the view of the light. But I don't know how to interpolate the light-view vertices for every fragment (I'm using my own software graphics pipeline).

Apologies, I omitted a lot and made some pretty big errors in my first post (you aren't in the same space because your view matrix differs between your shadow-map pass and your "normal" pass, and you need to offset in addition to scale to generate shadow-map texture coordinates). Although I don't know the specifics of your implementation, you can probably pass your world-space coordinate to the fragment shader as a texture coordinate. I'm sure there are smarter ways of doing this (like you alluded to in your initial post) that only involve interpolating depth.
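To make the scale-plus-offset concrete: it can be expressed as a single 4x4 "bias" matrix applied to the light's NDC coordinates (and since it's affine, you can also concatenate it with the light's projection matrix and still divide by w afterwards). A small sketch, with made-up values:

```python
import numpy as np

# "Bias" matrix mapping NDC x, y, z in [-1, 1] to shadow-map
# texture space [0, 1]: scale by 0.5, then offset by 0.5.
bias = np.array([
    [0.5, 0.0, 0.0, 0.5],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])

ndc = np.array([-1.0, 1.0, 0.0, 1.0])   # some point in the light's NDC
uvz = bias @ ndc                         # x -> 0.0, y -> 1.0, z -> 0.5
```

Scaling alone (without the 0.5 offset) would leave you with coordinates in [-0.5, 0.5], which is why the offset matters.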

Yeah, the most straightforward way to do this is to pass your world position (which of course you have access to in the vertex shader) to the pixel shader via a TEXCOORD semantic.

Then in your pixel shader, you can multiply that by the View * Projection matrix of your light (so your camera's V and P matrices are shader parameters used by your vertex shader as usual, and your light's V and P matrices are shader parameters used by your pixel shader). This should get you in the same space as what you used to generate your shadow map.
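Since you're writing a software pipeline, the two stages described above might look something like this as plain Python functions (the matrix names are assumptions; positions are homogeneous 4-vectors):

```python
import numpy as np

def vertex_stage(pos_world, cam_view, cam_proj):
    # Output the camera clip-space position as usual, and pass the
    # world-space position along as an extra attribute -- the rasterizer
    # interpolates it per pixel (the TEXCOORD role in a hardware pipeline).
    clip = cam_proj @ cam_view @ pos_world
    return clip, pos_world

def pixel_stage(interp_world, light_view, light_proj):
    # Re-project the interpolated world position with the light's matrices;
    # after the divide this is the same NDC space the shadow map was
    # generated in, ready for the scale/offset into texture coordinates.
    p = light_proj @ light_view @ interp_world
    return p[:3] / p[3]
```

The key design point is that the camera's matrices live in the vertex stage and the light's matrices live in the pixel stage, so each fragment ends up in the light's space regardless of which camera you rendered with.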

But the world position in the vertex shader is per-vertex; how do I know the world position for every fragment (or pixel)?

And what space are the x and y axes of the shadow map in? I think it's window space, where coordinates are all integers, but the View * Projection matrix only takes us to clip space, right?