Software Coordinating Committee Conference Call
March 1, 2007 1:00 PM EST
Recorder: C. DeTar

Present: Brower, DeTar, Levkova, Fowler, Joo, Edwards, Jie Chen,
         Efstathiadis, Scholz, Renner, Basak, Gottlieb, Osborn,
         Pochinski, Jung, Simone, Holmgren
Absent:  Mawhinney, Clark, Watson, Zhang, Khoriaty

=====================================================================
** Action items
============= Agenda =================

1. Propagator format standard (Robert, James, Carleton et al)

   DeTar: I circulated a propagator format among a subcommittee of
   the affected parties. On this call we agreed on these standards:

   File XML:

      File layout:
         Use character strings to distinguish them. For example
            USQCD_DiracFermion_12 for 12 solution vectors
            LHPC_DiracPropagator for the dense LHPC format
            For staggered propagators, to be determined

      Gamma matrix convention:
         Use character strings to specify the gamma matrix
         convention. It sounds like we all use the "DeGrand-Rossi"
         convention. Call this the USQCD convention.

      Record order for the 12 solution file:
         Pick one and require it. Then we don't need to list the
         contents.

      Datatype name defines the byte order:
         Use
            USQCD_F3_DiracFermion
            LHPC_DiracPropagator as a synonym for the current "Lattice"
            USQCD_F3_ColorMatrix as a synonym for the current
               QLA_F3_ColorMatrix

2. QMT Multi-Threading interface design and prototyping:
   (Robert et al; see QMT slides at http://super.bu.edu/~brower/scc )

   Pochinski: We should postpone our discussion until we know more
   about the BG/Q.

   Edwards: We can make progress with our Opterons right now.

   Simone: We just need to be sure our implementations are flexible.

   Fowler: What benchmarks can we use?

   Simone: Wilson Dslash sounds like a good starting place.

   Fowler: We would like to run some experiments.

   Edwards: After the inverter, global reductions are another
   problem. The list goes on. So picking a simple test case is what
   we should do.

   Edwards: I circulated a document last night, following our BU
   discussion last Fall.
   Jie Chen has a QMT implementation.
   [ See thread_vs_multiproc.txt ]

   Chen: We are experimenting with OpenMP for threading.

   Simone: But can you bind a process to a core?

   Chen: No.

   Fowler: In typical code there is a lot of sequential stuff between
   the multithreaded loops that becomes rate limiting. I haven't seen
   impressive success stories in OpenMP.

   Chen: Threading works well up to four cores.

   Edwards: The basic question is whether it outperforms the dumbest
   multiMPI implementation. We don't know yet.

   Joo: We can do timing easily, but the outcome depends on how you
   tweak it.

   Osborn: Couldn't we run a simple test?

   Chen: I have been running only fundamental tests, but not with any
   physics application built on top of it.

   Brower: The last item in the All Hands meeting is a software
   discussion. This might be a good opportunity to alert the
   community about what is coming for multicore software.

   [ The call was concluded because of an audiobridge conflict. The
   discussion will be continued in two weeks when Robert is again
   available. ]

3. Shared Memory Model for multi-core (Don)

4. Et al

Committee conference concluded at 2:10 PM EST.
Next call Mar 8 at 1:00 PM EST
======================================================================