View on GitHub
Open this notebook in GitHub to run it yourself
QFold: Quantum Walks and Deep Learning to Solve Protein Folding
Introduction
Protein folding is an important problem at the foundation of biological phenomena. If we could predict how a sequence of amino acids folds into a three-dimensional structure, it would greatly contribute to drug design and understanding diseases. However, the folding problem is extremely difficult in computational science.- Structure space grows explosively
- Each amino-acid residue has dihedral angles ψ and φ, and their combinations generate countless possible spatial structures.
- For example, if φ and ψ are each discretized into several dozen levels, the number of possible structures grows exponentially with respect to the number of residues N.
- Limitations of classical computation
- Classically, the search is performed using molecular dynamics (MD) simulations or Monte Carlo (MC) methods.
- In the MC method in particular, the procedure “compute energy difference ΔE → determine whether to accept the new structure using the classical Metropolis rule” is repeated.
- However, because many energy minima exist, classical searches tend to fall into local optima, and large-scale exploration is computationally difficult.
- Potential of quantum computation
- Quantum computers can make use of superposition (parallelism) and amplitude amplification.
- In particular, the Szegedy-type quantum walk is a quantized version of a classical Markov chain, and the paper shows the possibility of accelerating the Metropolis method. In other words, by representing local moves with a quantum walk, it is expected that the structure space can be explored more efficiently.
Method: QFold, A protein-folding method based on quantum walks
In the proposed method, there is a classical processing part (Initialization module) that uses deep learning, and a quantum/classical Metropolis method (Simulation module) that executes quantum computation based on the obtained classical processing. The combination of this classical processing part and the quantum computation part is called QFold. The quantum computation part alone is referred to as the quantum Metropolis method. The quantum Metropolis method aims to perform an approximate exploration of the energy landscape by embedding an update rule similar to the Metropolis–Hastings method into a quantum circuit based on the Szegedy-type quantum walk.- Classical processing [2]
- Quantum encoding
- Rotation-angle qubits: encode the discretized angles φ and ψ
- Move-id register: selects the dihedral angle to be updated
- Coin qubit: encodes the Metropolis acceptance probability
- Coin preparation and acceptance probability
- Measurement and interpretation of the results
Dataset
The dataset used in this study is a table of energy differences ΔE that were precomputed by classical computation (e.g., Psi4) after restricting the protein folding problem to a finite number of discrete states. In this notebook, the energy set by Minifold is not computed; instead, the dataset described in reference [2] is used. Each state is uniquely represented by a binary string of five types of bits, and each bit is assigned the following meaning. The dataset stores “the ΔE obtained when a state (φ, ψ) is changed according to the move-id.”| Quantum register | Physical meaning | Notes |
|---|---|---|
| register | Current value of the dihedral angle (0 = 0°, 1 = ) | Rotational degree of freedom of the protein backbone |
| register | Current value of the dihedral angle (0 = 0°, 1 = ) | The other main dihedral angle |
| register | move-id (0 → change / 1 → change ) | Specifies which angle to move |
| move-val | (0 = , 1 = ) | In the 1-bit discretization this is always 1 (fixed at +π) |
| coin (auxiliary) | coin placeholder | The actual coin qubit is generated inside the circuit; the input is always 0 |
- (the dataset always uses 1, so fixed to 1)
- coin placeholder
Loading Dataset
Output:
Output:
Implementation of walk operator
Coin Preparation
Following Algorithm 1, for each state , is read out and the corresponding is computed. For example, in the case (updating ), is obtained for the current structure with and , and by taking its average, is determined. This is converted into , and a controlled rotation is applied to the coin register. Similarly, when , for updating is computed and applied to the coin register. If the coin register is 1, the or register is flipped according to the corresponding move-id. This corresponds to accepting the new structure. After that, the coin rotation is uncomputed to erase the auxiliary information, and by adding the oracle operator, the amplitude of states corresponding to low-energy structures is amplified. In QFold (particularly in the quantum circuit of Fig. 6), the reason the oracle is applied to is that at every step of the walk, the auxiliary registers for coin and move are always reset to . In other words, since all update processes (the accept/reject decision) proceed starting from that state, the oracle operator for selecting “good structures” is also defined with respect to . Below, the actions of the coin operator and the oracle operator are explained. First, write the state as : Next, a rotation is given to the coin based on . For example, when (updating ): Here, . Next, the coin operator is applied (Compute). If , is flipped: If , nothing is done. Next, the Uncompute of the coin operator is performed. Then, by reversing the coin rotation, is reset to : As a result, the information of the rotation angle (which rotation angle was applied) remains only in , and return again to . Finally, by applying the oracle operator to the subspace , the information of the acceptance probability alone is marked: Therefore, by alternately applying the coin operator, the oracle operator, and the shift operator at each walk step, the information including the acceptance probability is amplified, and by repeatedly implementing the quantum walk, the information reflecting the optimal structure is obtained. More detailed information is described in [1]. In Algorithm 1 of the paper, in constructing the coin operator, energy-difference values that are close to each other are used. For , “00001” and “01001” are used, and for , “00101” and “10101” are used. One might think that it is not necessary to include other information, but by using only the information of and , other rotation information (for example, ) can be included through quantum parallelism arising from superposition. This can be expressed by the following code.Output:
bulding blocks
Output:
Output:
Output:
01101 or 01100 has close to minimum energy of given protein.