04. January 2024

Version 2.4.2: 8th public release.

- This version coincides with version 2.4.1 apart from the fact that all bugs listed for version 2.4.1 in the KNOWN_BUGS file are corrected.

18. May 2023

Version 2.4.1: 7th public release.

- This version coincides with version 2.4 apart from the fact that all bugs listed for version 2.4 in the KNOWN_BUGS file are corrected.

01. May 2022

Version 2.4: 6th public release.

- Transformed openQCD to a hybrid MPI/OpenMP program. With respect to version 2.0, the functionality is otherwise essentially unchanged.
- Implemented a more robust solver for the little Dirac operator, which should be able to cope with the case where the operator has a few exceptionally low modes.
- Improved the error handling in the programs for the DFL_SAP_GCR solver. Fatal errors now only occur if the solver is in fact unable to solve the Dirac equation to the required precision.
- The computation of the little Dirac operator was reworked so as to make it more transparent and better suited for threaded processing.
- Added a module utils/futils.c that permits the status variables in the force and action programs to be handled in a unified fashion.
- Added two modules, archive/iodat.c and msfcts/fiodat.c, which provide standard functions for configuration input and output.
- Rebased the code for the random number generator on integer arithmetic and added AVX inline assembly to the time-critical parts of the code. The algorithm is unchanged, but the interfaces are now such that each OpenMP thread gets a private instance of the generator.
- Made an effort to eliminate duplicate code from the main programs. These are now significantly shorter and should be more readable.
- Corrected the known bugs of the previous version as well as various small errors in the documentation.
- Revised the documentation file doc/parms.pdf.
- Added the documentation file doc/dfl.pdf.
- Revised the top README file and the README files in the directory "main".

11. November 2019

Version 2.0: 5th public release.

- Implemented the SMD (Stochastic Molecular-Dynamics) simulation algorithm and added main programs (qcd2.c, ym2.c) that run such simulations. See README.qcd2 and README.ym2. A new module, update/smd.c, has been added containing the basic functions implementing the algorithm. Functions for the rotation of the momentum and pseudo-fermion fields are included in the modules mdflds/mdflds.c and forces/force{1,..,5}.c.
- Reworked the field I/O programs in modules/archive/ so as to allow for blockwise parallel I/O in a universal format. See main/README.io for further explanations. Added a main program (cvt1.c) that can be used to convert gauge field configurations from one supported output format to another (see README.cvt1).
- Adapted the check programs in devel/archive so as to check all field I/O programs.
- Large sums of double-precision floating-point numbers are now performed in quadruple precision (see doc/qsum.pdf). The basic quadruple-precision arithmetic functions are contained in a new module utils/qsum.c.
- All solvers (CGNE, MSCG, SAP_GCR, DFL_SAP_GCR) now admit two types of user-selectable stopping criteria, the traditional one based on the standard square norm of the spinor fields and a second one based on the uniform norm of the fields (see doc/unorm.pdf and doc/parms.pdf). The functions calculating the uniform norm of single- and double-precision spinor fields are contained in a new module sflds/unorm.c.
- Added two new functions to the module linalg/liealg.c, one calculating the uniform norm of fields with values in the SU(3) Lie algebra and the other flipping the sign of such fields.
- Added a check program (forces/check12.c) that allows the numerical precision of the pseudo-fermion actions and forces calculated in QCD simulations to be studied as a function of the solver tolerances.
- The memory required for the fields and buffers allocated by the simulation programs has been substantially reduced with respect to the previous release. A memory-occupation report is printed to the log file at the start of the simulations.
- Streamlined the workspace administration (file utils/wspace.c) and added support for a shared workspace for single- and double-precision spinor fields [program "wsd_uses_ws()"].
- Removed the double-precision basis elements of the deflation subspace from the deflation block grid. These occupied a significant amount of memory, but were effectively obsolete.
- Reduced the memory space required for the pseudo-fermion fields by storing only the field variables on the even lattice points in the case of the actions ACF_TM1_EO_SDET, ACF_TM2_EO, ACF_RAT_SDET and ACF_RAT. Moreover, the fields now no longer include communication buffers at their ends and thus contain either VOLUME or VOLUME/2 spinors.
- Reduced the memory space required for the chronological solver by storing only the field variables on the even lattice points in the case of the forces FRF_TM1_EO_SDET, FRF_TM2_EO, FRF_RAT_SDET and FRF_RAT.
- The programs for the pseudo-fermion actions may now optionally make use of the chronological solver. In addition, the programs that rotate the pseudo-fermion fields in the SMD algorithm optionally update the chronological solution stacks used by the associated forces and actions.
- Added two modules, swalg.c and swexp.c, in modules/sw_term/ containing various functions required for the implementation of an alternative "exponential" variant of the O(a)-improvement terms (see doc/dirac.pdf). The variant is fully supported and can be chosen at runtime through a parameter passed to the lattice parameters data base (flags/lat_parms.c).
- Added a module utils/array.c that allows multi-dimensional arrays of any type to be allocated and handled safely without pointer gymnastics.
- Added a new module directory, modules/dft/, containing a set of modules implementing the fast Fourier transform of lattice fields of any type in four dimensions. These programs are needed when calculating statistical errors in master-field simulations.
- Added a new module directory, modules/msfcts/, containing a set of modules providing basic support for computing statistical errors in master-field simulations.
- Added two measurement main programs, xms1.c and xms2.c, illustrating the computation of observables and statistical errors in master-field simulations.
- Corrected the program name_size() [utils/mutils.c] so as to make it safe against buffer overflows when file names contain wide decimal numbers.
- Corrected the known bugs of the previous version as well as various small errors in the documentation.
- Revised the documentation file doc/parms.pdf.
- Added the documentation files doc/dft.pdf, doc/qsum.pdf and doc/unorm.pdf.
- Revised the top README file.

12. October 2016

Version 1.6: 4th public release.

- Added phase-periodic boundary conditions in the space directions for the quark fields (thanks to Isabel Campos). See doc/dirac.pdf and doc/parms.pdf for details.
- The function chs_ubnd() [lattice/bcnds.c] has been removed since its functionality is now contained in set_ud_phase() and unset_ud_phase() [uflds/uflds.c].
- The error functions have been moved to a new module utils/error.c and the error_loc() function has been changed so that it calls MPI_Abort() when an error occurs in the calling MPI process. There is then no use for the error_chk() function anymore and so it has been removed.
- There is a new module utils/hsum.c providing generic hierarchical summation routines. These are now used in most programs where such sums are performed, thus leading to an important simplification of the programs (see linalg/salg_dble.c, for example).
- Corrected flags/rw_parms.c so as to get rid of compiler warnings when NPROC=1.
- Corrected the notes in utils/endian.c.
- In main/ym1.c deleted superfluous second call of alloc_data() in the main program.
- In main/ms4.c replaced "x0=(N0-1)" by "x0==(N0-1)" on line 185 and "if (sp.solver==SAP_GCR)" by "else if (sp.solver==SAP_GCR)" on line 491 (thanks to Abdou Abdel-Rehim for noting these bugs).
- In random/ranlux.c inserted error message in check_machine() to make sure the number of actual MPI processes matches NPROC.
- In the main programs qcd1.c and ym1.c interchanged the order of init_ud() and init_rng().
- Corrected avx.h in order to get rid of some Intel compiler warnings.
- Corrected a bug in cmat_vec() and cmat_vec_assign() [linalg/cmatrix.c] that could lead to a segmentation fault if the option -DAVX is set, the argument "n" is odd and if the matrix "a" is not aligned to a 16 byte boundary (thanks to Agostino Patella).
- Corrected the use of alignment data attributes so as to fully comply with the syntactic rules described in the gcc manuals.
- Reworked the module su3fcts/chexp.c in order to make it safer against aggressively optimizing compilers.
- Included the AVX su3_multiply_dble and su3_inverse_multiply_dble macros in the module su3fcts/su3prod.c. Modified the inline assembly code in prod2su3alg() in order to make it safer against aggressively optimizing compilers.
- Included new inline assembly code in avx.h that makes use of fused multiply-add (FMA3) instructions. These can be activated by setting the compiler option -DFMA3 together with -DAVX. Several modules now include such instructions, notably cmatrix.c, cmatrix_dble.c, salg.c and salg_dble.c.
- Adapted all check programs to the changes in the modules.
- Modified devel/nompi/su3fcts/time1.c so as to more accurately measure the time required for su3xsu3_vector multiplications.
- Added an "LFLAGS" variable to the Makefiles that allows the compiler options in the link step to be set to a value different from CFLAGS.
- Revised the documentation files doc/dirac.pdf and doc/parms.pdf.
- Revised the top README file.
- Revised modules/flags/README.

22. April 2014

Version 1.4: 3rd public release.

- Changed the way SF boundary conditions are implemented so that the time extent of the lattice is now NPROC0*L0 rather than NPROC0*L0-1.
- Adapted all modules and main programs so as to support four types of boundary conditions (open, SF, open-SF and periodic). For detailed explanations see main/README.global, doc/gauge_action.pdf, doc/dirac.pdf, and doc/parms.pdf.
- The form of the gauge action near the boundaries of the lattice with SF boundary conditions has been slightly modified with respect to the choice made in version 1.2 (see doc/gauge_action.pdf; the modification only concerns actions with double-plaquette terms).
- The programs for the light-quark reweighting factors now support twisted-mass Hasenbusch decompositions into products of factors. In the program main/ms1.c, the factorization (if any) can be specified through the input parameter file. NOTE: the layout of the data on the output data file produced by ms1.c had to be changed with respect to openQCD-1.2.
- Slightly modified the program main/ms2.c so as to allow for different power-method iteration numbers when estimating the lower and upper end of the spectrum of the Dirac operator (see main/README.ms2).
- Removed flags/sf_parms.c since the functionality of this module is now included in lat_parms.c.
- Updated documentation files gauge_action.pdf, dirac.pdf, parms.pdf and rhmc.pdf.
- In main/ms4.in and main/qcd1.in replaced "[Deflation projectors]" by "[Deflation projection]".
- Removed main/qcd2.c since main/qcd1.c now includes the case of SF boundary conditions.
- Corrected all check programs in ./devel/* so as to take into account the different choices of boundary conditions. Many check programs now have a command line option -bc that allows the type of boundary condition to be specified at run time.
- Corrected a bug in Dwee_dble() [modules/dirac/Dw_dbl.c] that shows up in some check programs if none of the local lattice sizes L1,L2,L3 is divisible by 4. The functionality of the other modules and the main programs in ./main was not affected by this bug, because Dwee_dble() is not called in any of these programs.
- Corrected modules/flags/rw_parms.c so as to allow for Hasenbusch factorized reweighting factors.
- Corrected and improved the descriptions at the top of many module files.
- Corrected devel/ratfcts/INDEX.
- Added forgotten "plots" directory in devel/nompi/main.
- Replaced &irat in MPI_Bcast(&irat,3,MPI_INT,0,MPI_COMM_WORLD) by irat in flags/force_parms.c [read_forc_parms() and read_force_parms2()]. This is not a mistake but an unnatural and unintended use of the C language. Corrected analogous cases in a number of check programs (thanks to Hubert Simma and Georg Engel for noting these misprints).
- Corrected check program block/check1.c (the point labeling does not need to respect any time ordering).

12. May 2013

Version 1.2: 2nd public release.

- Added AVX inline-assembly to the time-critical functions (Dirac operator, linear algebra, SAP preconditioner, SU(3) functions). See the README file in the top directory of the distribution.
- Added support for blocked MPI process ranking, as is likely to be profitable on parallel computers with multi-core nodes (see main/README.global).
- Made the field import/export functions more efficient by avoiding the previously excessive use of MPI_Barrier().
- Added import/export functions for the state of the random number generators. Modified the initialization of the generators so as to be independent of the ranking of the MPI processes. See the notes in modules/random/ranlux.c. Added a check program in devel/random.
- Continuation runs of qcd1, qcd2, ym1 and ms1 now normally reset the random number generators to their state at the end of the previous run. The programs initialize the generators in the traditional way if the option -norng is set (see README.qcd1, for example).
- Modified the deflated SAP+GCR solver (dfl/dfl_sap_gcr.c) by replacing the deflation projectors by an inaccurate projection in the preconditioner (as suggested by Frommer et al. [arXiv:1303.1377]; the deflation subspace type and subspace generation algorithm are unchanged). This leads to a structural simplification and, after some parameter tuning, to a slight performance gain. NOTE: the deflation parameter set is changed too and the number of status variables is reduced by 1 (see modules/flags/dfl_parms.c, modules/dfl/dfl_sap_gcr.c and doc/parms.pdf).
- Included a program (devel/dfl/check4.c) that allows the parameters of the deflated SAP+GCR solver to be tuned on a given lattice.
- Deleted the now superfluous modules/dfl/dfl_projectors.c.
- Added the function fdigits() [utils/mutils.c] that allows double-precision floating point numbers to be printed with all significant decimal digits (and only these). The main programs make use of this function to ensure that the values of the decimal parameters are printed to the log files with as many significant digits as were given on the input parameter file (assuming not more digits were specified than can be represented by a double number).
- Replaced "if" by "else if" on line 379 of main/ms2.c. This bug stopped the program with an error message when the CGNE solver was used. It had no effect when other solvers were used.
- Changed the type of the variable "sf" to "int" in lines 257 and 440 of forces/force0.c. This bug had no effect in view of the automatic type conversions performed by the compiler.
- Corrected sign in line 174 of devel/sap/check2.c. This bug led to wrong check results, thus incorrectly suggesting that the SAP modules were incorrect.
- Corrected a mistake in devel/tcharge/check2.c and devel/tcharge/check5.c that gave rise to wrong results suggesting that the tested modules were incorrect.

14. June 2012

Version 1.0: Initial public release.