Nathan Moore wrote:
> Any suggestions? I figured that this would be a simple example to
> parallelize. Is there a better example for OpenMP parallelization?
> Also, is there something obvious I'm missing in the example below?
A few thoughts ...
Initialize your data in parallel as well; there is no reason not to. But
tighten that code a bit first. You don't need

    v_y = v_ground + (v_cloud-v_ground)*(j*dy/Ly)
    boundary(i,j)=0
    v(i,j) = v_y

when

    v(i,j) = v_ground + (v_cloud-v_ground)*(j*dy/Ly)
    boundary(i,j)=0

eliminates the explicit temporary variable. Also, the i.eq.0 test in the
if-then construct is guaranteed never to be hit, and neither is the j.eq.0
test. You can (and should) replace that if-then construct with a pair of
loops of the form

    do j=1,Ny
       boundary(Nx,j) = 1
    end do
    do i=1,Nx
       boundary(i,Ny) = 1
    end do
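To make the parallel initialization concrete, here is a minimal sketch. The grid size, the potentials, and the program name are made-up values for illustration, not taken from your code, and I have put j in the outer loop because Fortran stores arrays column by column. It should build with something like gfortran -fopenmp; without that flag the directives are just comments and it runs serially.

```fortran
program init_parallel
  implicit none
  ! Illustrative assumptions: grid size and potentials are not from the original code.
  integer, parameter :: Nx = 64, Ny = 64
  real(kind=8), parameter :: v_ground = 0.0d0, v_cloud = 100.0d0
  real(kind=8), parameter :: Ly = 1.0d0
  real(kind=8), parameter :: dy = Ly / Ny
  real(kind=8) :: v(Nx, Ny)
  integer :: boundary(Nx, Ny)
  integer :: i, j

  ! Every (i,j) is independent, so the outer loop can be split across threads.
  !$omp parallel do default(none) shared(v, boundary) private(i, j)
  do j = 1, Ny
     do i = 1, Nx
        v(i, j) = v_ground + (v_cloud - v_ground) * (j * dy / Ly)
        boundary(i, j) = 0
     end do
  end do
  !$omp end parallel do

  ! The edge loops replace the if-then test; they are short, so they stay serial here.
  do j = 1, Ny
     boundary(Nx, j) = 1
  end do
  do i = 1, Nx
     boundary(i, Ny) = 1
  end do

  ! Self-check: Nx + Ny - 1 cells are flagged, and the top row sits at v_cloud.
  if (count(boundary == 1) /= Nx + Ny - 1) stop 1
  if (abs(v(1, Ny) - v_cloud) > 1.0d-12) stop 1
  print *, 'initialization ok'
end program init_parallel
```

The two edge loops overlap at (Nx,Ny), which is harmless since both write the same value.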
Also, what sticks out to me is that old_v may end up being treated as
"shared" rather than "private". OpenMP shares variables in a parallel
region by default unless told otherwise, so you probably need to mark
old_v explicitly as private. And dv, for that matter.
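A sketch of what explicit marking could look like follows. I am assuming here that old_v and dv are scalar temporaries used once per grid point, and that v carries one layer of halo cells (indices 0 and N+1); both are guesses about your code. The arrays vnew and change, and the linear test profile, are hypothetical names and data chosen so the result can be checked. Using default(none) forces every variable to be listed, which makes a forgotten private show up as a compile error rather than a race.

```fortran
program private_scalars
  implicit none
  ! Illustrative assumptions: grid size, halo layout, and test data are made up.
  integer, parameter :: Nx = 32, Ny = 32
  real(kind=8) :: v(0:Nx+1, 0:Ny+1), vnew(Nx, Ny), change(Nx, Ny)
  real(kind=8) :: old_v, dv
  integer :: i, j

  ! Linear profile in j: the four-point average reproduces it exactly.
  do j = 0, Ny + 1
     v(:, j) = dble(j)
  end do

  ! old_v and dv are written by every iteration, so each thread needs its own copy.
  !$omp parallel do default(none) shared(v, vnew, change) private(i, j, old_v, dv)
  do j = 1, Ny
     do i = 1, Nx
        old_v = v(i, j)
        vnew(i, j) = 0.25d0 * (v(i-1, j) + v(i+1, j) + v(i, j+1) + v(i, j-1))
        dv = vnew(i, j) - old_v
        change(i, j) = dv
     end do
  end do
  !$omp end parallel do

  ! Self-check: a linear profile is already a solution, so nothing should change.
  if (maxval(abs(change)) > 0.0d0) stop 1
  print *, 'private scalars ok'
end program private_scalars
```

Note that this sketch writes the update into a separate array rather than back into v, so no thread reads a value another thread is in the middle of changing.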
Note also that this inner loop is attempting to do a convergence test.
You are setting a globally shared value from within an inner loop, which
is not a good thing to do: every thread writes to the same variable, so
those accesses either have to be serialized or end up contending with
each other.
I would suggest a slightly different inner loop and convergence test
(note: Fortran will not multiply by a logical directly, so the mask below
uses merge(); I haven't run this, so adjustment may be needed):

    real*8 vnew(Nx,Ny),dv(Nx,Ny)
    do i=1,Nx
       do j=1,Ny
          ! notice that the if-then construct is gone ...
          ! the merge() mask is 1.0 for interior points and 0.0 for
          ! boundaries, so vnew and dv are 0.0 on the boundaries
          vnew(i,j) = 0.25d0*(v(i-1,j)+v(i+1,j)+v(i,j+1)+v(i,j-1))* &
                      merge(1.0d0, 0.0d0, boundary(i,j).eq.0)
          dv(i,j) = (dabs(v(i,j)-vnew(i,j)) - convergence_v )* &
                    merge(1.0d0, 0.0d0, boundary(i,j).eq.0)
       end do
    end do
    ! now all you need is a "linear scan" to find positive elements in
    ! dv. You can approach these as sum reductions, and do them in
    ! parallel
    do i=1,Nx
       sum=0.0d0
       do j=1,Ny
          ! only the positive entries of dv contribute to the sum
          sum = sum + max(dv(i,j), 0.0d0)
       end do
       if (sum .gt. 0.0d0) converged = 0
    end do
The basic idea is to replace the inner loop conditionals and remove as
many of the shared variables as possible.
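Put together, the idea might look like the sketch below. The reduction(+:excess) clause gives each thread its own partial sum and combines them at the end, so the shared flag is set once, outside the loop. The variable names excess and mask, the halo layout of v, the tolerance, and the perturbed test data are all assumptions made for illustration.

```fortran
program convergence_reduction
  implicit none
  ! Illustrative assumptions: sizes, tolerance, halo layout, and test data are made up.
  integer, parameter :: Nx = 32, Ny = 32
  real(kind=8), parameter :: convergence_v = 1.0d-6
  real(kind=8) :: v(0:Nx+1, 0:Ny+1), vnew(Nx, Ny), dv(Nx, Ny)
  integer :: boundary(Nx, Ny)
  real(kind=8) :: mask, excess
  integer :: i, j, converged

  ! Linear profile with one interior point perturbed, so the test must fail.
  do j = 0, Ny + 1
     v(:, j) = dble(j)
  end do
  v(10, 10) = v(10, 10) + 1.0d0
  boundary = 0
  boundary(Nx, :) = 1
  boundary(:, Ny) = 1

  excess = 0.0d0
  !$omp parallel do default(none) shared(v, vnew, dv, boundary) &
  !$omp private(i, j, mask) reduction(+:excess)
  do j = 1, Ny
     do i = 1, Nx
        ! mask is 1.0 on interior points and 0.0 on boundary points
        mask = merge(1.0d0, 0.0d0, boundary(i, j) == 0)
        vnew(i, j) = 0.25d0 * (v(i-1, j) + v(i+1, j) + v(i, j+1) + v(i, j-1)) * mask
        dv(i, j) = (abs(v(i, j) - vnew(i, j)) - convergence_v) * mask
        ! accumulate only the amount by which a point exceeds the tolerance
        excess = excess + max(dv(i, j), 0.0d0)
     end do
  end do
  !$omp end parallel do

  ! The shared flag is written once, after the threads have been combined.
  converged = 1
  if (excess > 0.0d0) converged = 0

  ! Self-check: the perturbed point guarantees a non-converged result.
  if (converged /= 0) stop 1
  print *, 'convergence test ok'
end program convergence_reduction
```

A max reduction over the absolute change, compared against the tolerance afterwards, would work equally well; the sum form is used here only because it matches the scan described above.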
Also, cf. the examples here: http://www.linux-mag.com/id/4609 specifically
the Riemann zeta function (fairly trivial).
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
      http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615