a lookup table is just an array, a string, or some other data structure. the only important thing is that this block of data is already populated with the info we need, so it is easy (and fast) to retrieve and use. it is the computer equivalent of a cheat sheet: you use part of memory to store important things so they are ready to access quickly.

for example, before it became standard for desktop computers to have a math coprocessor, any floating point calculation was tedious and slow because it was done in software.
many programs (like video games) would simply read the entire sin/cos (or whatever) table from a file, or calculate it (slowly) BEFORE the game actually starts. then during game play there is no lag. when you need to calculate the position or trajectory of some bouncing sprite, instead of calculating sin(48), which could take dozens or even hundreds of clock cycles, you simply read it from a list of already calculated values stored in some array, such as Arr_Sin[48], which is way faster: typically just a few cycles. this is still used in microcontrollers for the same reason (no math coprocessor, so floating point calculations are slow).
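
to make the idea concrete, here is a minimal sketch in C. the names (arr_sin, init_sin_table, fast_sin, slow_sin) and the choice of one entry per whole degree are assumptions for illustration, and slow_sin is just a stand-in (a plain Taylor series) for whatever slow software routine a real program would use.

```c
#include <assert.h>

#define TABLE_SIZE 360             /* assumed: one entry per whole degree */

static float arr_sin[TABLE_SIZE];  /* hypothetical name, modeled on Arr_Sin */

/* stand-in for a slow software sine: Taylor series, x in radians */
static double slow_sin(double x)
{
    double term = x, sum = x;
    for (int n = 1; n <= 20; n++) {
        /* next term: previous term times -x^2 / ((2n)(2n+1)) */
        term *= -x * x / ((2.0 * n) * (2.0 * n + 1.0));
        sum += term;
    }
    return sum;
}

/* slow part: run once, before the game loop starts */
void init_sin_table(void)
{
    const double pi = 3.14159265358979323846;
    for (int deg = 0; deg < TABLE_SIZE; deg++)
        arr_sin[deg] = (float)slow_sin(deg * pi / 180.0);
}

/* fast part: a single array read instead of a calculation */
float fast_sin(int deg)
{
    assert(deg >= 0 && deg < TABLE_SIZE);
    return arr_sin[deg];
}
```

you would call init_sin_table() once at startup and then use fast_sin(48) wherever the game loop needs the value. the trade-off is memory for speed: the table costs 360 floats, and the precision is limited to whole degrees unless you make the table bigger or interpolate.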

the second question is simply about more efficient access to a block of data. if you are accessing a single value, you may as well calculate the address and then get the data. but if you are accessing a block of data, it may be more convenient to use this addressing method because:

DPTR points at the beginning of the data block you need to read or copy or whatever. instead of incrementing DPTR each time, you leave it as is and let your instruction (MOVC or whatever) perform the address calculation.

suppose you have a 2-dimensional array storing student names.
each row is one name (say up to 50 characters), and you have 300 students, so your array is 300 rows of 50 chars.

if you are to read (copy) the name of student 139, your pointer would point to the beginning of that name, and you would use a for-next loop to read all the following characters (assuming we have to do it one by one). the efficiency is in letting the hardware do the address calculation (pointer + loop index) instead of performing an additional calculation in your own code (wasting CPU clocks). here A would hold the value of the loop counter. in this example the loop only does up to 50 runs (so it saves 50 additions), but when you deal with larger data blocks, this can be a huge saving.
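
the same base + index pattern can be sketched in C. this is only an illustration under assumed names (names, copy_name, NAME_LEN, STUDENTS): base plays the role of DPTR and is set once, and the loop counter i plays the role of A. how the compiler turns base[i] into actual instructions depends on the target.

```c
#include <assert.h>

#define NAME_LEN  50    /* characters per name */
#define STUDENTS  300   /* number of rows */

static char names[STUDENTS][NAME_LEN];   /* hypothetical table of names */

/* copy one row into dest, which must have room for NAME_LEN chars */
void copy_name(int student, char *dest)
{
    assert(student >= 0 && student < STUDENTS);

    const char *base = names[student];   /* like DPTR: set once, never changed */

    for (int i = 0; i < NAME_LEN; i++)   /* i is the offset, like A */
        dest[i] = base[i];               /* address used = base + i */
}
```

the pointer to the row is calculated once, outside the loop, and only the offset changes on each pass.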