----------------------------------------------------------------------
Hi All

Here is a brief recollection of our discussions in a simple form.

Best,
B

Brief Recollections of CRE subcommittee.

Present: Carleton DeTar, Efstratios (Stratos) Efstathiadis, Don Holmgren,
         Balint Joo, Enno Scholz, Jim Simone, Chip Watson
         (did I miss anyone?)

Date: 8 Feb 2007
---------

i) What not to do.

Basically the most contentious item so far (to Joo at least, and no one strongly disagreed) was setting up the mapping between Nodes/Cores/MPI IDs and QMP IDs. Hence all those items which need these mappings (including file assembly, disassembly, copying etc.) should be dropped.

ii) What do users need/want?

Carleton: "If I am asked how fast my code runs at the JLAB and the answer is needed tomorrow, having to compile my own code is a barrier to getting this answer. The libraries should be there for me to use."

Balint: "I counter by saying that if I get no user demand for a package, I won't install it or maintain the installation. If a project gets time on a machine, surely only that machine should install those packages and maintain them."

At this point the situation becomes a little like the chicken-and-egg argument.

Conclusion: In addition to our installations of QMP, QDP++ and Chroma, we should also provide installations of QLA, QDP/C, QIO (standalone) and QDPQOP at JLab and also at various sites. So our list of canonically installed SciDAC packages becomes:

  QMP
  QLA
  QDP/C
  QIO
  QDP++
  QDPQOP
  CHROMA

Action Item #1: Carleton to furnish us with pertinent version numbers (actually even better for me: CVS version tags, if the code is in CVS) for standalone QLA, QDP/C, QIO and QDP++.

Action Item #2: Balint to fold these into the local JLab build system for at least Infiniband builds. The site install task should build them (no need for nightlies).

Various nuances came out of the ensuing discussion:

a) We will base our compilations on gcc-3.4.x (x==4?) with reasonable compilation flags.
This will provide a 'base-line'. It may be that other compilers / flags do a better job. Hence any directory naming scheme for installation should allow one to distinguish between packages built with various compilers. However, bleeding-edge and experimental fastest builds with novel compilers are not to be standardized for now (since the relevant compiler may not be available across all sites anyway).

b) Rather than unifying the look and feel of the sites, one should have uniform look-and-feel documentation that describes:

1) The packages installed at the site, with information about
   - package version
   - package architecture
   - package location
   - package compiler
   so that the users can find the packages for the particular subset of nodes they are running on.

2) The vagaries of each site's file storage (dCache, NFS servers, copy commands etc.)

3) Job submission and sample job scripts which can be quickly adapted. (This should probably include queue names, PBS server identification, links to the man pages for PBS commands as well as the sample job scripts.)

4) Compilers: installations, locations, a good set of compiler flags

The main idea here is that since the sites are not necessarily similar and may be hard to unify, at least it should be easy (same look and feel) to find the information about the particularities of each one.

Action Item #3: Don to create a skeleton set of pages. Balint suggested the SciDAC house style if it is easy to use.

iii) Common run command: qrun or qcdrun (or an env var: $run)?

One item of common concern is that the users need to know various incantations to get their code to run. On Gig-E one has QMP_run.sh; on Infiniband, various versions of mpirun exist between JLab and Fermilab. It would be nice, and actually doable, to create a single run command. All the information should be available from PBS (i.e. the node file or other PBS variables), so this should be straightforward to do.

This set of objectives is in principle achievable.
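
As a rough illustration of the common run command from item iii), here is a minimal sketch in Python. It is not something agreed at the meeting. It assumes the job runs under PBS with $PBS_NODEFILE set (one line per allocated slot) and that the site's mpirun accepts -np and -machinefile; both would need checking per site. The names count_slots and build_command are hypothetical.

```python
import os


def count_slots(nodefile_path):
    """Count non-empty lines in a PBS node file (one line per allocated slot)."""
    with open(nodefile_path) as f:
        return sum(1 for line in f if line.strip())


def build_command(program_args, env=None, launcher="mpirun"):
    """Return the launcher argv that would run program_args on the PBS allocation.

    Assumption: the launcher takes '-np <count>' and '-machinefile <file>'.
    A site whose launcher differs would swap in its own argument list here.
    """
    env = os.environ if env is None else env
    nodefile = env.get("PBS_NODEFILE")
    if not nodefile:
        raise RuntimeError("PBS_NODEFILE is not set; run this inside a PBS job")
    nprocs = count_slots(nodefile)
    return [launcher, "-np", str(nprocs), "-machinefile", nodefile] + list(program_args)


def main(argv):
    """Hypothetical entry point: replace this process with the launcher."""
    cmd = build_command(argv)
    os.execvp(cmd[0], cmd)
```

A wrapper script installed as qrun would call main(sys.argv[1:]), so that a user types only "qrun ./my_program args" inside a job script. The Gig-E case (QMP_run.sh) is left out here because its arguments are not described in these notes; a per-site switch on the launcher would be the natural place to add it.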
Further modifications should be driven by actual (rather than perceived) user demand.

--
-------------------------------------------------------------------
Dr Balint Joo              High Performance Computational Scientist
Jefferson Lab
12000 Jefferson Ave, Mail Stop 12B2, Room F217,
Newport News, VA 23606, USA
Tel: +1-757-269-5339,      Fax: +1-757-269-5427
email: bjoo@jlab.org       (old email: bj@ph.ed.ac.uk)
-------------------------------------------------------------------