Getting a basic idea of how parallel processing works in a fragment shader, and how GPU memory is used

I am new to GLSL and GPGPU. I want to understand how a fragment shader for matrix multiplication or histogram computation works as a parallel algorithm.

All I know so far is that a fragment shader runs once for each pixel.

Suppose I have a 5x3 matrix A and a 3x6 matrix B, and I want to compute the 5x6 matrix C = A * B.

If I represent A as a texture, what size should the texture be, and how do I store the matrix in it?

If C is also a texture, does the fragment program run once for each pixel of C, and how does that relate to the total number of pixels the system can process? How are registers and memory used? How can I get an idea of how the flow of the algorithm maps onto GLSL? And what memory constraints should I be aware of when developing GPGPU applications with GLSL?
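To make the question concrete, here is how I currently picture the matrix case, written as a CPU-side sketch in plain Python rather than real GLSL. The names `fragment` and `render` are my own hypothetical stand-ins (for one shader invocation and for drawing a quad that covers C), and the nested lists stand in for textures. I am assuming one matrix element per texel, which may not be how it is actually done.

```python
# CPU-side sketch of the per-pixel model (plain Python, not GLSL).
# Nested lists stand in for textures: tex[row][col].
A = [[float(r * 3 + c) for c in range(3)] for r in range(5)]  # 5x3 matrix
B = [[float(r * 6 + c) for c in range(6)] for r in range(3)]  # 3x6 matrix


def fragment(x, y, tex_a, tex_b):
    """Hypothetical stand-in for one fragment shader invocation.

    Computes the single output pixel C[y][x] by reading row y of A
    and column x of B. It writes nothing except its own result.
    """
    inner = len(tex_b)  # shared dimension (3 in this example)
    return sum(tex_a[y][k] * tex_b[k][x] for k in range(inner))


def render(tex_a, tex_b):
    """Hypothetical stand-in for drawing a quad covering the output texture.

    On a GPU the calls to fragment() would be independent of each other
    and could run in parallel; here they simply run one after another.
    """
    height = len(tex_a)    # rows of C (5)
    width = len(tex_b[0])  # columns of C (6)
    return [[fragment(x, y, tex_a, tex_b) for x in range(width)]
            for y in range(height)]


C = render(A, B)
```

If this picture is right, C needs 5 x 6 = 30 fragment invocations, each doing 3 multiply-adds, and what I am really asking is how those 30 invocations are scheduled and where A and B live while they run.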

I have read that computing a histogram with more than 80 bins is not possible with GPGPU on ordinary graphics cards.
My laptop has an NVIDIA GT 425M with 1 GB of RAM. Is it possible to develop a GPGPU program to compute a histogram on it?
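For reference, this is the histogram computation I have in mind, again as a plain Python sketch and not GLSL. The function names and the assumption of 8-bit values (0 to 255) are mine. The first version is the usual CPU loop, which writes to a data-dependent bin. The second is my guess at a shape that fits the per-pixel model better: one output pixel per bin, each counting the inputs that fall into it.

```python
def histogram_scatter(pixels, bins):
    """Usual CPU loop: each input value increments a data-dependent bin."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // 256] += 1  # assumes 8-bit values, 0..255
    return counts


def histogram_gather(pixels, bins):
    """Guess at a fragment-style version: one output 'pixel' per bin.

    Each bin independently scans all inputs and counts its own matches,
    so no invocation writes outside its own output location.
    """
    def fragment(bin_index):
        return sum(1 for value in pixels if value * bins // 256 == bin_index)

    return [fragment(b) for b in range(bins)]
```

As far as I understand, the first form is awkward for a fragment shader because the output position depends on the data, while the second does bins times as much reading. I would like to know whether a limit such as the 80 bins I read about comes from this kind of trade-off or from a hardware limit.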