US7885796B2 - Parallel calculation method and device - Google Patents
Parallel calculation method and device Download PDFInfo
- Publication number
- US7885796B2 US7885796B2 US10/572,745 US57274506A US7885796B2 US 7885796 B2 US7885796 B2 US 7885796B2 US 57274506 A US57274506 A US 57274506A US 7885796 B2 US7885796 B2 US 7885796B2
- Authority
- US
- United States
- Prior art keywords
- density
- matrix
- computers
- submatrixes
- calculation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Links
- 238000004364 calculation method Methods 0.000 title claims abstract description 105
- 239000011159 matrix material Substances 0.000 claims abstract description 132
- 238000000034 method Methods 0.000 claims abstract description 58
- 230000010354 integration Effects 0.000 claims abstract description 15
- 238000004219 molecular orbital method Methods 0.000 claims abstract description 10
- 238000003078 Hartree-Fock method Methods 0.000 claims abstract description 5
- 238000012546 transfer Methods 0.000 claims description 63
- 238000003860 storage Methods 0.000 claims description 32
- 238000004776 molecular orbital Methods 0.000 claims description 21
- 238000004891 communication Methods 0.000 claims description 12
- 238000004965 Hartree-Fock calculation Methods 0.000 claims description 8
- 230000005540 biological transmission Effects 0.000 claims description 5
- 239000012620 biological material Substances 0.000 claims description 4
- 230000006870 function Effects 0.000 claims description 4
- 230000000704 physical effect Effects 0.000 claims description 4
- 238000004088 simulation Methods 0.000 claims description 4
- 239000000126 substance Substances 0.000 claims description 4
- 238000004590 computer program Methods 0.000 claims description 2
- 230000014759 maintenance of location Effects 0.000 claims 7
- 101000802640 Homo sapiens Lactosylceramide 4-alpha-galactosyltransferase Proteins 0.000 description 14
- 102100035838 Lactosylceramide 4-alpha-galactosyltransferase Human genes 0.000 description 14
- 238000004422 calculation algorithm Methods 0.000 description 11
- 238000004593 restricted open-shell Hartree–Fock calculation Methods 0.000 description 4
- 238000012601 sigmoid curve-fitting method Methods 0.000 description 4
- 238000004774 atomic orbital Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 229920001222 biopolymer Polymers 0.000 description 2
- 239000000470 constituent Substances 0.000 description 2
- 125000004122 cyclic group Chemical group 0.000 description 2
- 239000013543 active substance Substances 0.000 description 1
- -1 biological materials Chemical class 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 150000002605 large molecules Chemical class 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 238000012900 molecular simulation Methods 0.000 description 1
- 230000000737 periodic effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/13—Differential equations
Definitions
- the present invention relates to a parallel computing method and system suitable for molecular simulation or the like, and in particular relates to a parallel computing method and system suitable for molecular orbital calculation by the RHF (Restricted Hartree-Fock) method.
- RHF Restricted Hartree-Fock
- HF Hartree-Fock
- F and S are given as N ⁇ N matrixes
- C is given as an M ⁇ N matrix
- ⁇ is an M-dimensional vector
- N is the number of atomic orbitals (AO)
- M is the number of molecular orbitals (MO) in a molecule under analysis.
- F is called a Fock matrix, and is given as Eq. (2).
- ⁇ represents the summation for all the molecular orbitals occupied by electron.
- tu) are physical quantities called overlap integral, H-core integral and two-electron integral, respectively, and are represented with atomic orbits ⁇ r as in Eqs. (4) to (6).
- Eq. (1) is given in the form of a linear equation, but the Fock matrix F is determined depending on atomic orbitals ⁇ i , so that it is actually a non-linear equation, which cannot be solved exactly.
- SCF Self-consistent filed
- JP-A-2000-293494 Japanese Patent Application Laid-open No. 2001-312485 (JP-A-2001-312458) and Japanese Patent Application Laid-pen No. H9-50428 (JP-A-9-050428).
- Any of these methods uses a single host machine such as a vector computer for executing matrix calculation or the like, while using parallel computers or a cluster of computers to execute calculation of Fock matrixes F or calculation of two-electron integrals.
- Patent document 1 JP-A-2000-293494
- Patent document 2 JP-A-2001-312485
- Needed secondly is to establish a technique to reduce the amount of computation in the above method.
- no method has been known when all the matrix elements are stored in distributed memory areas as described above. It is possible to realize speedup if the number of machines coupled in parallel is increased, and reduction, if possible, in the amount of computation makes further improvement possible, hence establishment of such a technique is important.
- a parallel computing method of the present invention includes the steps of: using a computer cluster made up of a plurality of computers; dividing a density matrix into multiple density submatrixes and distributing the multiple density submatrixes to the multiple computers and storing therein; and executing calculation processes on the density submatrixes in each of the computers while sequentially transferring the multiple density submatrixes between the multiple computers.
- a parallel computing system of the present invention is a parallel computing system for executing calculation of the Hartree-Fock method in a molecular orbital method, includes a computer cluster made up of a plurality of computers, wherein each of the computers includes: a matrix storage for storing density submatrixes which are divided from a density matrix; a transfer controller for performing transmission and reception of the density submatrixes with respect to the other computers in the computer cluster; and a calculation processor for performing a calculation on the density submatrix stored in the matrix storage, and wherein calculation processes on the density submatrixes are executed in each of the computers while the multiple density submatrixes are being sequentially transferred between the multiple computers.
- the present invention relates to a calculating method of the Hartree-Fock method in which the density matrix is divided into density submatrixes, to enable generation of a Fock matrix by sequentially transferring the divided density submatrixes, reduce the amount of computation by increasing the work areas and by combination of different transfer orders, and reduce the amount of transfer by skipping one step in the transfer method. Since the method of the present invention is a method in which the scale of calculation is only dependent on the memory capacity of the whole system, it is possible to achieve large-scale calculation and reduce computational time by connecting a large number of computers in parallel.
- FIG. 2 is a block diagram showing a logical configuration of each computer.
- FIG. 3 is a flowchart showing an algorithm for generating a Fock matrix when a density matrix is divided, based on the parallel computing method of the present invention.
- FIG. 4 is a chart showing the order of transfer to a region (RS) (matrix J(RS)) in a case with a node number of 4.
- FIG. 5 is a chart showing the order of transfer to a region (TU) (matrix D(TU)) in a case with a node number of 4.
- FIG. 8 is a rewritten chart of the order of transfer shown In FIG. 6 , for easy understanding.
- FIG. 9 is a chart for illustrating the presence or absence of calculation of two-electron integrals when two-electron integral (RS
- FIG. 10 is a chart for illustrating the presence or absence of calculation of two-electron integrals when two-electron integral (RS
- TU two-electron integral
- a computer cluster or a system of distributed memory parallel computers shown in FIG. 1 is presumed.
- the computer system in one embodiment of the present invention shown in FIG. 1 includes a plurality of computers having equivalent capabilities coupled via communication devices. Since the conventional molecular orbital computing algorithm needs a large amount of memory, in most cases high-performance computers such as so-called super computers etc., have been used as the host computer. Since a high-performance computer is generally high in price, it sometimes entails the cost problem. Use of the method of the present invention, however, does not need such a host computer, it is hence possible to cut down the cost. In the example shown in FIG.
- each computer is composed of, as shown in FIG.
- Step 104 the density density submatrix is transferred from node PC 1 to node PC 2 . Since another density submatrix is sent to node PC 1 from another node, node PC 1 stores the received submatrix into the matrix storage, instead of the density submatrix that has been transferred to node PC 2 . Node PC 2 transfers the density submatrix stored therein to still another node.
- the parallel computing method of the present invention is not limited to the above.
- the present invention can be implemented in various embodiment modes, as will be described hereinbelow, by selecting submatrixes to be transferred and transfer modes. In the description hereinbelow, it is assumed that the number of nodes is 4 and the number of submatrixes into which a matrix is divided is 4, for description simplicity.
- Fock matrix F is defined as Eq. (7).
- the nodes are named P 11 , P 21 , P 12 and P 22 , and the submatrixes are allotted to each node as shown in Eq. (11).
- the state in which the matrixes are thus allotted will be called initial distribution state (S- 0 ).
- Eq. (7) should be calculated part by part every time density matrix D is transferred.
- the condition for making this method succeed is that each node should have a different submatrix of matrix D from the others at any time.
- Transfer should be caused to occur next time only at nodes P(i, j) at which this condition holds.
- Eq. (21) is called the transfer condition.
- a computing algorithm for realizing these transfers includes the following steps:
- Matrixes J 1 (RS) and D(RS) are transferred in accordance with FIG. 4 , matrixes D(TU) and J 2 (TU) in accordance with FIG. 5 , matrixes K 1 (RU) and D(RU) in accordance with FIG. 6 , and matrixes D(TS) and K 2 (TS) in accordance with FIG. 7 . That is, the way of transfer is different from each other depending on the regions (RS), (TU), (RU) and (TS).
- numerals m and n are rewritten as Eq. (27).
- J 1 , J 2 , K 1 and K 2 are calculated as follows.
- node numbers are renamed as in Eq. (30).
- condition (c4) it is possible to perform calculation by reversing the order of transfer or by performing the last transfer first.
- the above algorithm may and should be executed in the order Of [9], . . . , [15], [1], . . . , [8].
- the different kinds of cyclical density matrix schemes described above can be applied to computation for a case with the number of nodes being N and the number of division being M (>N). In this case it is necessary to divide each matrix in the same manner. That is, the number in region R and that in region T need to correspond to each other, and the number in region S and that in region U need to correspond to each other.
- the “quadruple cyclical density matrix scheme, Type-2” can be generalized into the following algorithm, including the steps of:
- condition (c-4) holds, it is possible to perform calculation by reversing the order of transfer or by performing the first transfer by a single step only.
- each computer as a constituent of the computer cluster has to be one that executes the aforementioned process at each node.
- Each computer loads a computer program for realizing the processes as a node and executes the program so as to perform each of the aforementioned processes.
- Such a program is loaded by the computer from a recording medium such as magnetic tape and CD-ROM, or via a network.
- scope of the present invention also includes program products of the above program, mechanically readable recording media storing this program and transmission media for transmitting this program.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Operations Research (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Algebra (AREA)
- Complex Calculations (AREA)
Abstract
Description
FC=SCε (1)
where, D is called a density matrix and defined with MO coefficients C as in Eq. (3).
where, “occ” in symbol Σ represents the summation for all the molecular orbitals occupied by electron. S, H, (rs|tu) are physical quantities called overlap integral, H-core integral and two-electron integral, respectively, and are represented with atomic orbits φr as in Eqs. (4) to (6).
where, “core” in symbol I represents the summation for the nucleus. Incidentally, Eq. (1) is given in the form of a linear equation, but the Fock matrix F is determined depending on atomic orbitals φi, so that it is actually a non-linear equation, which cannot be solved exactly. To solve this equation, a technique called SCF(Self-consistent filed) method is used.
R1, S1, T1, U1=1, . . . , N/2, R2, S2, T2, U2=N/2+1, . . . , N, (8)
When these are represented in an integrated manner, they can be represented as in Eq. (10).
P(1)=P11, P(2)=P21, P(3)=P12, P(4)=P22 (15)
where, J and K are matrixes presenting summations of Coulomb integrals and exchange integrals, respectively. This can be written down for every region and every node in a similar manner as for the case of the cyclical density matrix scheme, and Eqs. (17) and (18) are obtained.
where, two-electron integrals (rs|tu) calculated at each node are arranged so that they are equal to each other for every distribution states of J and K. In other words, they are arranged so that calculation of two-electron integrals (rs|tu) can be done at a time only. For this purpose, submatrixes of K should also be transferred. Transfers of matrixes J(RS), D(TU), K(RU) and D(TS) are done as shown in
P(1,1)=P11, P(2,1)=P21, P(1,2)=P12, P(2,2)=P12 (19)
t=m/j max (20)
i=(n mod t)+1 (21)
m=(s−1)N+r, n=(u−1)N+t, (23)
d(m,n)=1 for m≠n, ½ for m=n (24)
m=(j−1)i max +i(i,j={1,2}), n=(l−1 )k max +k(k,l={a,b}) (27)
where, i and j denote the numbers {1, 2} in regions R and S, k and l denote the numbers {a, b} in regions T and U. Here, a and b are converted so that a=1 and b=2. There are several possible conditions using m and n to perform calculations by dividing matrix J into J1 and J2 and matrix K into K1 and K2.
m≧n Condition (C-1)
m≦n Condition (C-2)
m=n, m+n=odd for m<n,, m+n=even for m>n Condition (C-3)
m=n, m+n=odd for m>n,, m+n=even for m<n Condition (C-4)
Claims (10)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2003330290 | 2003-09-22 | ||
JP2003-330290 | 2003-09-22 | ||
PCT/JP2004/013734 WO2005029352A1 (en) | 2003-09-22 | 2004-09-21 | Parallel calculation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
US20060271301A1 US20060271301A1 (en) | 2006-11-30 |
US7885796B2 true US7885796B2 (en) | 2011-02-08 |
Family
ID=34372986
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/572,745 Expired - Fee Related US7885796B2 (en) | 2003-09-22 | 2004-09-21 | Parallel calculation method and device |
Country Status (3)
Country | Link |
---|---|
US (1) | US7885796B2 (en) |
JP (1) | JP4612546B2 (en) |
WO (1) | WO2005029352A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1811413A4 (en) * | 2004-09-27 | 2008-02-20 | Japan Science & Tech Agency | MOLECULAR ORBIT DATA PROCESSING DEVICE FOR AN ELONGATION PROCESS |
US8065503B2 (en) * | 2006-12-15 | 2011-11-22 | International Business Machines Corporation | Iteratively processing data segments by concurrently transmitting to, processing by, and receiving from partnered process |
KR101472417B1 (en) * | 2013-07-09 | 2014-12-12 | 주식회사 엘지화학 | Method for analysis of characteristics of molecular orbital using assembly of consecutive buliding block and system using the same |
US20160162625A1 (en) | 2013-09-26 | 2016-06-09 | Synopsys, Inc. | Mapping Intermediate Material Properties To Target Properties To Screen Materials |
WO2015048532A1 (en) | 2013-09-26 | 2015-04-02 | Synopsys, Inc. | Parameter extraction of dft |
US10489212B2 (en) | 2013-09-26 | 2019-11-26 | Synopsys, Inc. | Adaptive parallelization for multi-scale simulation |
WO2015048509A1 (en) | 2013-09-26 | 2015-04-02 | Synopsys, Inc. | First principles design automation tool |
US10516725B2 (en) | 2013-09-26 | 2019-12-24 | Synopsys, Inc. | Characterizing target material properties based on properties of similar materials |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS6158076A (en) | 1984-08-29 | 1986-03-25 | Sumitomo Electric Ind Ltd | Simultaneous primary equation simulator |
JPS63238653A (en) | 1986-11-27 | 1988-10-04 | Nippon Telegr & Teleph Corp <Ntt> | Data processor and its processing method |
JPH0950428A (en) | 1995-08-10 | 1997-02-18 | Hitachi Ltd | Calculation system for molecular orbital analysis |
JPH0950458A (en) | 1995-08-10 | 1997-02-18 | Hitachi Ltd | Two-electron integral calculation method for molecular orbitals |
JP2000020501A (en) | 1998-07-03 | 2000-01-21 | Toshiba Corp | Parallel computer system and communication method between arithmetic processing units |
JP2000293494A (en) | 1999-04-09 | 2000-10-20 | Fuji Xerox Co Ltd | Parallel computing device and parallel computing method |
JP2000298658A (en) | 1999-04-13 | 2000-10-24 | Fuji Xerox Co Ltd | Parallel processing method and parallel processing device |
JP2000305923A (en) | 1999-04-21 | 2000-11-02 | Fuji Xerox Co Ltd | Matrix element parallel calculation method and molecular orbital calculation method |
JP2001312485A (en) | 2000-04-28 | 2001-11-09 | Fuji Xerox Co Ltd | Job assignment method and parallel processing method in parallel processing method |
JP2003099408A (en) | 2001-09-25 | 2003-04-04 | Japan Science & Technology Corp | Parallel calculation method |
-
2004
- 2004-09-21 WO PCT/JP2004/013734 patent/WO2005029352A1/en active Application Filing
- 2004-09-21 US US10/572,745 patent/US7885796B2/en not_active Expired - Fee Related
- 2004-09-21 JP JP2005514078A patent/JP4612546B2/en not_active Expired - Fee Related
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS6158076A (en) | 1984-08-29 | 1986-03-25 | Sumitomo Electric Ind Ltd | Simultaneous primary equation simulator |
JPS63238653A (en) | 1986-11-27 | 1988-10-04 | Nippon Telegr & Teleph Corp <Ntt> | Data processor and its processing method |
JPH0950428A (en) | 1995-08-10 | 1997-02-18 | Hitachi Ltd | Calculation system for molecular orbital analysis |
JPH0950458A (en) | 1995-08-10 | 1997-02-18 | Hitachi Ltd | Two-electron integral calculation method for molecular orbitals |
JP2000020501A (en) | 1998-07-03 | 2000-01-21 | Toshiba Corp | Parallel computer system and communication method between arithmetic processing units |
US6631391B1 (en) | 1999-04-09 | 2003-10-07 | Fuji Xerox Co., Ltd. | Parallel computer system and parallel computing method |
JP2000293494A (en) | 1999-04-09 | 2000-10-20 | Fuji Xerox Co Ltd | Parallel computing device and parallel computing method |
JP2000298658A (en) | 1999-04-13 | 2000-10-24 | Fuji Xerox Co Ltd | Parallel processing method and parallel processing device |
US6799151B1 (en) | 1999-04-13 | 2004-09-28 | Taisho Pharmaceutical Co., Ltd | Method and apparatus for parallel processing |
JP2000305923A (en) | 1999-04-21 | 2000-11-02 | Fuji Xerox Co Ltd | Matrix element parallel calculation method and molecular orbital calculation method |
JP2001312485A (en) | 2000-04-28 | 2001-11-09 | Fuji Xerox Co Ltd | Job assignment method and parallel processing method in parallel processing method |
US20030204576A1 (en) | 2000-04-28 | 2003-10-30 | Sou Yamada | Method for assigning job in parallel processing method and parallel processing method |
JP2003099408A (en) | 2001-09-25 | 2003-04-04 | Japan Science & Technology Corp | Parallel calculation method |
US20040260529A1 (en) | 2001-09-25 | 2004-12-23 | Toshikazu Takada | Parallel calculation method |
Also Published As
Publication number | Publication date |
---|---|
WO2005029352A1 (en) | 2005-03-31 |
JP4612546B2 (en) | 2011-01-12 |
US20060271301A1 (en) | 2006-11-30 |
JPWO2005029352A1 (en) | 2006-11-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | Iterative configuration interaction with selection | |
US6766342B2 (en) | System and method for computing and unordered Hadamard transform | |
CN111582491B (en) | Quantum circuit construction method and device | |
Aaronson et al. | Improved simulation of stabilizer circuits | |
Grigori et al. | CALU: a communication optimal LU factorization algorithm | |
CN111563599B (en) | Quantum circuit decomposition method and device, storage medium and electronic device | |
Ismail et al. | Multiresolution analysis in statistical mechanics. I. Using wavelets to calculate thermodynamic properties | |
US20090125461A1 (en) | Multi-Label Active Learning | |
CN114764549B (en) | Quantum circuit simulation calculation method and device based on matrix product state | |
Rhodes et al. | Identifiability of large phylogenetic mixture models | |
CN113222150B (en) | Quantum state transformation method and device | |
CN113850389B (en) | Quantum circuit construction method and device | |
CN114528996B (en) | Method, device and medium for determining initial parameters of target system test state | |
CN114492814A (en) | Method, device and medium for simulating energy of target system based on quantum computation | |
US7885796B2 (en) | Parallel calculation method and device | |
CN114511094B (en) | Quantum algorithm optimization method and device, storage medium and electronic device | |
CN114519429A (en) | Method, apparatus and medium for obtaining observability of target system | |
CN114692880B (en) | Quantum state amplitude simulation method and device in quantum circuit | |
US6799151B1 (en) | Method and apparatus for parallel processing | |
CN113222151B (en) | Quantum state transformation method and device | |
Jamal et al. | A hybrid CPU/GPU approach for the parallel algebraic recursive multilevel solver pARMS | |
CN117010516A (en) | Method and device for determining Hamiltonian amount of target system | |
CN115907016B (en) | Quantum-based method for calculating search target range value and related device | |
CN116052759A (en) | Hamiltonian volume construction method and related device | |
CN116090568A (en) | Method and device for determining size relation between quantum data and classical floating point data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: NEC CORPORATION, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TAKADA, TOSHIKAZU;YAMAMOTO, JUN-ICHI;NAKATA, KAZUTO;REEL/FRAME:017809/0458 Effective date: 20060317 Owner name: NEC SOFT, LTD., JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TAKADA, TOSHIKAZU;YAMAMOTO, JUN-ICHI;NAKATA, KAZUTO;REEL/FRAME:017809/0458 Effective date: 20060317 |
|
AS | Assignment |
Owner name: NEC SOFT, LTD., JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NEC CORPORATION;REEL/FRAME:023747/0014 Effective date: 20100106 |
|
AS | Assignment |
Owner name: NEC SOLUTION INNOVATORS, LTD., JAPAN Free format text: CHANGE OF NAME;ASSIGNOR:NEC SOFT, LTD.;REEL/FRAME:033290/0523 Effective date: 20140401 |
|
REMI | Maintenance fee reminder mailed | ||
LAPS | Lapse for failure to pay maintenance fees | ||
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20150208 |