fortran - Alignment of multi-dimensional array for omp simd
If I understand correctly, the aligned clause of the omp simd construct refers to the alignment of the whole array.
How is it used for multi-dimensional arrays? Assume
    ni = 131; nj = 137; nk = 127
    !allocates arr(1:131,1:137,1:127) aligned on 64 bytes
    call somehow_allocate_aligned(arr, [ni,nj,nk], 64)
    !$omp parallel do collapse(2)
    do k = 1, nk
      do j = 1, nj
        call some_complicated_subroutine(arr(:,j,k))
        !$omp simd aligned(arr:64)
        do i = 1, ni
          arr(i,j,k) = some arithmetic expression involving arr(i,j,k)
        end do
      end do
    end do
    !$omp end parallel do
Is this the correct way to indicate the alignment of the array, although each iteration of the inner loop starts at arr(1,j,k)?
How does the compiler use this information to infer the alignment of the inner-loop subarray?
Does it matter for performance if the run-time sizes are nicer (say 128, 128, 128)?
It is explained here, slides 160-165: http://irpf90.ups-tlse.fr/files/parallel_programming.pdf
You should:
1) align the array
2) use padding to force all columns to be aligned: the first dimension (as specified in the allocate statement) should be a multiple of the number of elements needed to reach a 16-, 32- or 64-byte boundary, depending on the instruction set.
For example, for a 99x29x200 matrix with the AVX instruction set (32-byte alignment) in double precision (8 bytes/element), you should do
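    n = 99
    l = 29
    m = 200
    ! round n up to the next multiple of 32/8 = 4 elements
    delta_n = mod(n,32/8)
    if (delta_n == 0) then
      n_pad = n
    else
      n_pad = n-delta_n+32/8
    end if
    allocate( a(n_pad,l,m) )
    ! Intel-specific directive; it normally goes with the declaration of a
    !dir$ attributes align : 32 :: a
    do k=1,m
      do j=1,l
        !$omp simd
        do i=1,n
          a(i,j,k) = ...
        end do
      end do
    end do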
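To see why the padding matters, the byte offset of each column start a(1,j,1) relative to a(1,1,1) can be checked by plain arithmetic. The following is only a sketch: the program name, the helper padded_extent and the counters are made up for illustration, and it assumes the array base itself is already 32-byte aligned.

```fortran
program check_column_alignment
  implicit none
  integer, parameter :: align_bytes = 32   ! AVX alignment, as in the example above
  integer, parameter :: elem_bytes  = 8    ! double precision
  integer, parameter :: n = 99, l = 29
  integer :: n_pad, j, misaligned_plain, misaligned_padded

  n_pad = padded_extent(n, align_bytes / elem_bytes)

  misaligned_plain  = 0
  misaligned_padded = 0
  do j = 1, l
    ! Byte offset of a(1,j,1) from a(1,1,1) for a leading dimension of n or n_pad
    if (mod((j - 1) * n * elem_bytes, align_bytes) /= 0) &
      misaligned_plain = misaligned_plain + 1
    if (mod((j - 1) * n_pad * elem_bytes, align_bytes) /= 0) &
      misaligned_padded = misaligned_padded + 1
  end do

  print *, 'n_pad =', n_pad
  print *, 'misaligned column starts, unpadded:', misaligned_plain
  print *, 'misaligned column starts, padded:  ', misaligned_padded

  ! Self-checks: 99 rounds up to 100, and every padded column start is aligned
  if (n_pad /= 100) error stop 'unexpected padded extent'
  if (misaligned_padded /= 0) error stop 'padded columns should all be aligned'

contains

  ! Hypothetical helper: round n up to the next multiple of chunk elements
  pure integer function padded_extent(n, chunk)
    integer, intent(in) :: n, chunk
    padded_extent = ((n + chunk - 1) / chunk) * chunk
  end function padded_extent

end program check_column_alignment
```

With a leading dimension of 99, consecutive columns are 792 bytes apart, which is not a multiple of 32, so most column starts fall off the boundary. With 100 they are 800 bytes apart and every column start stays aligned.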
You can use the C preprocessor to make the code portable by replacing the 32 and the 8 in the previous example with macros.
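A minimal sketch of what that could look like. The macro names ALIGN_BYTES and ELEM_BYTES are made up here, not taken from the slides, and the file has to be run through the preprocessor (for instance by giving it a .F90 suffix, or with -cpp for gfortran or -fpp for the Intel compiler).

```fortran
! Defaults; override at build time, e.g. with -DALIGN_BYTES=64
#ifndef ALIGN_BYTES
#define ALIGN_BYTES 32
#endif
#ifndef ELEM_BYTES
#define ELEM_BYTES 8
#endif

program padded_extent_cpp
  implicit none
  ! Number of elements per alignment boundary (4 with the defaults)
  integer, parameter :: chunk = ALIGN_BYTES / ELEM_BYTES
  integer :: n, delta_n, n_pad

  n = 99
  delta_n = mod(n, chunk)
  if (delta_n == 0) then
    n_pad = n
  else
    n_pad = n - delta_n + chunk
  end if

  print *, 'n_pad =', n_pad
  ! Self-check: the padded extent is a whole number of chunks
  if (mod(n_pad, chunk) /= 0) error stop 'n_pad is not a multiple of chunk'
end program padded_extent_cpp
```

Only the two macro values then need to change when moving to another instruction set or element size. The value in the alignment directive has to be kept consistent with ALIGN_BYTES as well.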
Note: be careful when using statements such as b=a with arrays, because the physical dimensions no longer correspond to the logical dimensions. A good practice is to set the boundaries explicitly, as in b(1:n,1:l,1:m) = a(1:n,1:l,1:m), since this will still work if you change the physical dimensions.
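As a small illustration of that note, here is a sketch in which a is padded and b is not. The unpadded b is an assumption made purely for the example; in real code b would often be padded too.

```fortran
program copy_logical_section
  implicit none
  integer, parameter :: n = 99, l = 29, m = 200   ! logical dimensions
  integer, parameter :: n_pad = 100               ! padded leading dimension of a
  double precision, allocatable :: a(:,:,:), b(:,:,:)

  allocate(a(n_pad, l, m))
  allocate(b(n, l, m))        ! left unpadded here, for illustration only

  a = 0.0d0                   ! this also zeroes the padding row
  a(1:n, :, :) = 1.0d0        ! fill only the logical part

  ! A plain b = a would involve the padded shape of a; depending on the
  ! compiler and its options, b is either reallocated to that shape or
  ! the assignment is a shape mismatch.
  b(1:n, 1:l, 1:m) = a(1:n, 1:l, 1:m)

  print *, 'shape(b) =', shape(b)
  ! Self-checks: b keeps its logical shape and sees none of the padding
  if (any(shape(b) /= [n, l, m])) error stop 'b changed shape'
  if (any(b /= 1.0d0)) error stop 'padding leaked into b'
end program copy_logical_section
```

Because the bounds are spelled out, the copy touches only the logical elements, whatever padding is later chosen for a.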