I have two questions here:
1. We all know that the first element of the array supplied as texture data corresponds to the lower-left corner of the texture image. In this case, however, the for loop starts by assigning the first array element a direction vector with x > 0, y > 0 and z > 0, and that vector does not point at the lower-left corner if we treat the cube-map face as a texture image viewed from the positive X axis towards the negative X axis. Is this an error in the code, or do I misunderstand something?

2. Why do we need to add an offset to the components of the direction vector?

I have been stuck on this problem for about two days; any suggestions or reference sources would be appreciated.

1. We all know the first element of array which is supplied as texture data corresponds to the lower-left corner of texture image,

For a 2D texture, the first element is the texel nearest (0,0). Whether that's the "lower-left" corner is a matter of interpretation.
For the positive-X face of a cube map texture, the first element is the texel nearest the (1,1,1) corner of the cube. Also, the s coordinate increases along the negative Z axis while the t coordinate increases along the negative Y axis. The mappings of the cube map faces are given in the specification of the ARB_texture_cube_map extension which you linked to:

The offset is half the size of a texel, so adding the offset results in the vector containing the position of the texel's centre rather than one of its corners.

Wow, thanks for your reply, you are so clever.
By the way, may I ask you another question: how did you acquire this kind of knowledge?
Which books or papers did you read to gain it?
No other meaning, I just felt curious, because I have finished studying the OpenGL Red Book and didn't gain this kind of knowledge.

Normalization cube maps were very useful in the past, when GPUs either had no instructions to normalize a vector (pre-DX9 hardware) or had such instructions but so little ALU power that using them was expensive (early DX9 hardware). Nowadays, however, it isn't worth using normalization cube maps; instead you should simply normalize vectors using the ALU (e.g. using the normalize built-in function of GLSL).

This is because over the last ten years GPUs have continued to increase their ALU power by a considerable amount, and nowadays memory bandwidth and memory access latency are far more of a bottleneck than ALU operations. In fact, you can usually perform tens if not hundreds of ALU instructions at the same cost as a single texture fetch, and while latency-hiding mechanisms do improve this ratio, it is safe to conclude that normalization instructions will always be faster than a normalization cube map. Not to mention that ALU-based normalization requires no additional memory and provides good precision without having to create large normalization cube maps with floating-point components.

Disclaimer: This is my personal profile. Whatever I write here is my personal opinion and none of my statements or speculations are anyhow related to my employer and as such should not be treated as accurate or valid and in no case should those be considered to represent the opinions of my employer.
Technical Blog: http://www.rastergrid.com/blog/