multithreading - Requesting an integer multiple of "M" cores per node on SGE


I want to submit a multi-threaded MPI job to SGE, and the cluster I am running on has different nodes, each with a different number of cores. Let's say the number of threads per process is M (M == OMP_NUM_THREADS for OpenMP). How can I request that a job submitted to an SGE queue be run in such a way that, on every node, an integer multiple of M cores is allocated to the job?

Let's say M=8, and the number of MPI tasks is 5 (so a total of 40 cores is requested). In the cluster there are nodes with 4, 8, 12, and 16 cores. Then this combination is OK:

2*(8-core nodes) + 1*(16-core node) + 0.5*(16-core node) 

but of course none of these are:

2*(4-core nodes) + 2*(8-core nodes) + 1*(16-core node)
2*(12-core nodes) + 1*(16-core node)
(3/8)*(8-core node) + (5/8)*(8-core node) + 2*(16-core nodes)

PS: There is a similar question, this one: (MPI & pthreads: nodes with different numbers of cores), but mine is different since I have to run exactly M threads per MPI process (think hybrid MPI+OpenMP).

The best scenario would be to run this job exclusively on one kind of node. But to shorten the time spent waiting for the job to start, I want to allow it to run on different kinds of nodes, provided that each node has integer*M cores allocated to the job.
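For reference, the kind of hybrid launch I have in mind looks roughly like this (just a sketch assuming an Open MPI-style launcher; the executable name is made up):

export OMP_NUM_THREADS=8        # M = 8 OpenMP threads per MPI process
mpirun -np 5 ./hybrid_solver    # 5 MPI processes x 8 threads = 40 cores in total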

The allocation policy in SGE is specified on a per parallel environment (PE) basis. Each PE can be configured to fill the slots available on the cluster nodes in a specific way. One requests a specific PE with the -pe pe_name num_slots parameter, and SGE then tries to find num_slots slots following the allocation policy of the pe_name PE. Unfortunately, there is no easy way to request slots in integer multiples per node.
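You can inspect which PEs exist on your cluster and what allocation policy each one uses with the standard qconf commands (the PE name below is just an example):

qconf -spl        # list the names of all configured parallel environments
qconf -sp mpi     # show the configuration of the "mpi" PE, including its allocation_rule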

In order to be able to request exactly M slots per host (and not a multiple of M), your SGE administrator (or you, in case you are the SGE administrator) must first create a new PE, let's call it mpi8ppn, set its allocation_rule to 8, and then assign the PE to each cluster queue. You would then submit the job to that PE with -pe mpi8ppn 40 and instruct the MPI runtime to start only one process per host, e.g. -npernode 1 for Open MPI, as sketched below.
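Purely as an illustration, the whole setup could look roughly like this (only allocation_rule really matters here; the slot count, queue name and executable are made up, and the remaining PE fields are left at their defaults):

# create the new PE (opens an editor with the PE template)
qconf -ap mpi8ppn

# key fields of the resulting PE, as shown by "qconf -sp mpi8ppn"
pe_name            mpi8ppn
slots              999
allocation_rule    8
control_slaves     TRUE
job_is_first_task  FALSE

# attach the PE to a cluster queue by adding it to the queue's pe_list
qconf -mq all.q

# submit with 40 slots in total, 8 per host, one MPI process per host
qsub -pe mpi8ppn 40 jobscript
# ... where jobscript contains something like:
export OMP_NUM_THREADS=8
mpirun -npernode 1 ./hybrid_solver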

If the above is unlikely to happen, another (unreliable) solution is to request a very high amount of memory per slot, close to what each node has, e.g. -l h_vmem=23.5G. Assuming that the nodes are configured with an h_vmem of 24 GiB, this request ensures that SGE won't be able to fit more than one slot on each host. So, if you'd like to start the hybrid job on 5 nodes, you'd ask SGE for 5 slots and 23.5G of vmem for each slot with:

qsub -pe whatever 5 -l h_vmem=23.5G <other args> jobscript 

or

#$ -pe whatever 5
#$ -l h_vmem=23.5G

This method is unreliable since it does not allow you to select cluster nodes with a specific number of cores, and it only works if all nodes are configured with an h_vmem of less than 47 GB. h_vmem serves just as an example here - any other per-slot consumable attribute should do. The following command should give you an idea of what host complexes are defined and what their values are across the cluster nodes:

qhost -F | egrep '(^[^ ])|(hc:)' 

The method works best for clusters where node_mem = k * #cores, with k being constant across all nodes. If a node provides twice the number of cores but also has twice the memory, e.g. 48 GiB, then the above request will give you two slots on such nodes.
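To make the arithmetic explicit: with -l h_vmem=23.5G per slot, a 24 GiB node fits only floor(24/23.5) = 1 slot, while a 48 GiB node fits floor(48/23.5) = 2 slots. That is also why the trick only holds as long as every node has less than 2 * 23.5 = 47 GB of h_vmem.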

I don't claim to fully understand SGE, and my knowledge dates back to the SGE 6.2u5 era, so simpler solutions might exist nowadays.

