Software Coordinating Committee Conference Call
May 24, 2007 1:00 PM EDT
Recorder: C. DeTar

Present: DeTar, Levkova, Brower, Scholz, Holmgren, Simone, Osborn, Pochinski, Zhang, Gottlieb, Basak, Jung, Watson, Edwards
Absent: Efstathiadis, Alan Porterfield, Fowler, Mawhinney, Renner, Clark, Khoriaty, Joo

=====================================================================
** Action items

================ Agenda ==================

SciDAC conference

Brower: Just after the SciDAC meeting on Friday, June 29 there will be workshops on the various SciDAC modules. Andrew Pochinski and James Osborn will give one on SciDAC ("C") on BG/L. See John Negele's announcement.

Should we encourage someone to do a software poster at Lattice 2007? There will be an ILDG poster and a short plenary talk.

1.) USQCD participation in ILDG. Lattice available and projection of lattice into next year. (Send information to Derek Leinweber). USQCD seems to me to be a little behind the curve on this. (Balint, Carleton et al.)

Brower: It looks like the US is a little behind everyone else.
DeTar: Yes, we have been resting on our NERSC laurels. As usual it depends on finding people who have time to spare to do it.
DeTar: It would be good to get things going before the Lattice conference.
Simone: We currently have to update the certificate list by hand. Eventually there will be software for doing that.
DeTar: I need to mark up the lattices we have stored so far.
Brower: We should communicate with Derek when things are ready so he can include it in his talk.

2.) File format issues: (ILDG is asking about propagator formats too)

DeTar: Having heard no objections, I am assuming everyone agrees with the proposed format I circulated last week.
Jung: Will Chroma support the new propagator formats?
Edwards: That is doable.
DeTar: The code should support our standard formats for reading and writing.
We have users at Fermilab who will be generating propagators with other codes and wanting to read them in Chroma.
Edwards: I agree, we need a reader and writer for these formats.

3.) Code Optimization for BG/P. (Good news/bad news: more time, since the machine probably won't be available until mid fall, BUT we may have to have code ready to run immediately thereafter in a short friendly-user period.)
  a.) QMP port.
  b.) QLA and compiler benchmarks
  c.) Level 3 RHMC code for DW and Asqtad

Jung: We have DWF running on the BG/L.
Brower: Pavlos Vranas has moved from IBM to Livermore.
Brower: James will work on QMP for the BG/P.
Pochinski: IBM has a research lab in Cambridge where James and Andrew can access a BG/P. Dong Chen and Jim Sexton are the IBM lattice gauge theorists.
DeTar: As for Asqtad, we are assuming we can use QOP/QDP/QLA. That works well on the BG/L for us.
Jung: My understanding is that the communication is the main change from L to P.
Pochinski: Timing of memory access is another issue.
DeTar: James, does QLA do any explicit prefetching yet?
Osborn: No. We just rely on the compiler to schedule it. I tried using xlc compiler intrinsics, but that only seemed to create a worse compilation.
Pochinski: The xlc compiler doesn't do much by itself. I played with prefetching in assembly code and it helps, but one has to be careful.
Jung: My QCDOC experience was that gcc didn't change the order of operations much, so it didn't move prefetching commands much. xlc seems to move them around.
Pochinski: The more recent xlc may do better. The asm syntax is different. It moves toward gcc. That makes it possible to force the compiler to schedule it as you say.
Brower: What is the status of your DWF code?
Pochinski: I expect to have it by the end of June for the L and P.
Brower: I would like to see a comparison in performance of your DWF and Chulwoo's (i.e. Pavlos's). Pavlos's code will not port easily to the P. I understood he was working on it at IBM.
Edwards: I believe Dong Chen is the new IBM contact.
Pochinski: We need to contact IBM to make sure we

4.) Combining QDP/C and QDP++

DeTar: Craig McNeile wants to call the QDP/C inverter from QDP++. He has problems with name clashes. I'll forward his message to you, Robert.
DeTar: Another request from Craig was to have a Dslash distinct from the inverter.

5.) Pochinski DWF inverter and Chroma

Edwards: We can use the unpreconditioned inverter for valence. For HMC, we require the same preconditioning because of our force term.
Pochinski: The force term calculation should be done at Level 3 as well.
Brower: It would be good to have a specification of the force term as well so Andrew can think about optimizing it.

6.) Reporting failure rates on machines.

Holmgren: The review last week went well. One question, though. The reviewers wanted to know what fraction of hours were on jobs that failed. We tried to get it from PBS logs. Is there any way to get it for jobs in 2006 and 2007? What can we do to ensure that we get the information in the future?
Jung: What about jobs that run for a week and fail, but can be restarted from the last checkpoint?
Holmgren: It clearly requires cooperation from the users. We need to have some sort of survey as well. Users will be hearing from Bill Barowski.

Future topics: When is a good time to have a report on:
  * FNAL/Vanderbilt cluster reliability project?
  * Visualization Project (Massimo)?
  * QMT -- Multi-thread design issues?
  * Distribution of new algorithms and data analysis code in tool box?
  * Will the Workflow tools be ported to other facilities beyond FNAL? When and by whom?

Committee conference concluded at 2:40 PM EDT.
Next call June 1 at 1:00 PM EDT
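
Note on item 6 (editorial illustration, not part of the discussion): the quantity the reviewers asked for, the fraction of hours spent on jobs that failed, could be computed roughly as sketched below once per-job records are available. This is a minimal sketch under stated assumptions: the JobRecord fields and the failed_hour_fraction function are hypothetical names, not a PBS log schema, and the checkpoint credit reflects Jung's point that a failed job which restarts from a checkpoint loses only the hours run since that checkpoint.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """One batch job. All field names are hypothetical, not a PBS log schema."""
    node_hours: float                    # total node-hours charged to the job
    failed: bool                         # True if the job ended abnormally
    hours_since_checkpoint: float = 0.0  # node-hours run after the last usable
                                         # checkpoint (set equal to node_hours
                                         # if the job never checkpointed)


def failed_hour_fraction(jobs, credit_checkpoints=False):
    """Return the fraction of charged node-hours attributed to failed jobs.

    With credit_checkpoints=True, a failed job only counts the node-hours
    spent after its last checkpoint, since earlier work can be restarted.
    """
    total = sum(job.node_hours for job in jobs)
    if total == 0:
        return 0.0
    lost = 0.0
    for job in jobs:
        if not job.failed:
            continue
        lost += job.hours_since_checkpoint if credit_checkpoints else job.node_hours
    return lost / total


# Made-up sample data for illustration only.
jobs = [
    JobRecord(node_hours=100.0, failed=False),
    JobRecord(node_hours=50.0, failed=True, hours_since_checkpoint=5.0),
    JobRecord(node_hours=50.0, failed=True, hours_since_checkpoint=50.0),
]
print(failed_hour_fraction(jobs))
print(failed_hour_fraction(jobs, credit_checkpoints=True))
```

The two calls show how much the answer depends on the accounting convention, which is why the checkpoint information would have to come from users (for example through the survey Holmgren mentions) rather than from the batch logs alone.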