Requirements
- matplotlib
- numpy
- scipy
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
The wormlike chain model¶
The wormlike chain model is a simple thermodynamic model of a polymer chain. It is commonly used to study the properties of DNA and other biological macromolecules that tend to become straight at low temperatures. A wormlike chain describes a polymer as a series of bonds joined end-to-end. A bond a position consists of a fixed-length tangent vector that points along the chain. At low temperature, the chain adopts a straight conformation, in which the tangent vectors are parallel to each other.
The energy of the wormlike chain is given by the sum of the inner products of adjacent bonds on the chain.
where is a physical parameter that characterizes the bending stiffness of the chain. This can be seen as torsional analogue of the spring constant of a linear chain of springs. At low temperature, all of the tangent vectors are parallel to each other, and the energy is zero. However, as the temperature is increased, the chain will adopt more random conformations, and the energy will increase. At very high temperatures, the chain resembles a random walk. The wormlike chain model thus qualitatively resembles a Heisenberg spin system with nearest-neighbor interactions, with the tangent vectors playing the role of spins.
We use the notation to represent the entire chain’s configuration at a given time.
Properties of the wormlike chain¶
A key parameter of the wormlike chain is the persistence length, which is the length over which the tangent vectors remain correlated. The persistence length is given by
where is the bond length and is the inverse temperature. At a given temperature, the persistence length determines the typical length over which the chain’s tangent vectors remain correlated. Qualitatively, it measures the chain’s stiffness in terms of the typical size scales over which it appears to be straight. In the same manner that the correlation length of a spin system diverges at low temperatures, the persistence length diverges at high temperatures and long chains. For finite chains, this means that the persistence length approaches the chain length .
The correlation function between the tangent vectors at two points and separated by a distance is thus given by
where is the separation between the two points and the average is taken over many configurations of the chain.
Another key property of the wormlike chain is the end-to-end distance, describing the distance between the first and last points of the chain. Intuitively, we expect this distance to decrease with increasing temperature, as the chain becomes more flexible and the ends can come closer together. Conversely, this quantity should approach the system size in the limit of low temperatures.
Analytically, the average end-to-end distance can be computed by taking the large- and computing an integral over the chain length coordinate . This results in the expression
where is the end-to-end vector and is the total length of the chain when fully extended.
Implementing the Wormlike chain model¶
We are now going to implement the wormlike chain model, and use Monte Carlo simulations to sample its equilibrium properties and verify the two analytical results above. Our implementation will first define a new object Bond
representing a unit-length tangent vector in 3D space. This object will essentially wrap a numpy array of shape (3,) with a unit norm. However, it will also provide a rotate_in_place
method that rotates the vector in place by an arbitrary angle about an arbitrary axis. To do this, it implements the Rodrigues rotation formula.
Our WormlikeChainMC
class will use an ordered list of Bond
objects to represent the chain’s unit tangent vectors. This is stored in the bonds
attribute. The constructor initializes all of the bonds in the chain to be straight, with small random transverse displacements.
In order to map this problem onto a Monte Carlo algorithm, we need to (1) define a function that sweeps through the chain and attempts to rotate bonds about a random axis by a random angle, and (2) define a function that computes the chain’s bending energy before and after the proposed move, and then performs a Metropolis acceptance test to determine whether to accept the proposed move.
Recall that, for a Metropolis algorithm, we compute the initial energy of a given chain configuration . We then propose a new configuration and calculate its energy . We then compute the change in energy . If , we accept the proposed move. Otherwise, we accept the move with probability , where is the inverse temperature.
In order to encourage our chain to equilibrate faster, we will add some structure to the proposed moves. Rather than attempting to update all bonds at once to generate a proposed new configuration , we will attempt to update one randomly-selected bond at a time. As you will see below, this simplies this calculation of change in energy . The process of selecting bonds at random to update is known as a “sweep”.
In our implementation, the proposed move will consist of rotating a single bond about a random axis by a random angle, using the rotate_in_place
method of the Bond
object.
To Do¶
Please complete the following tasks and answer the included questions. You can edit a Markdown cell in Jupyter by double-clicking on it. To return the cell to its formatted form, press [Shift]+[Enter]
.
- Suppose you have two chains, and , that differ only in the orientation of single bond, at site . Calculate (analytically) a minimal expression for the change in the energy between the two chains. Using this result, implement the
delta_energy
method in theWormlikeChainMC
class that takes the current chain configuration, an index , and a proposed new bond . The function should compute the change in energy between the current and proposed chain.
Your Answer: complete the code below
- Implement the
sweep
method in theWormlikeChainMC
class. A full sweep of the chain consists of visiting each bond individually and attempting to rotate it about a random axis by a random angle, and then computing the change in energy of the proposed chain based on the entire chain’s energy. After computing the change in energy from the proposed move, the method should accept or reject the move using a Metropolis acceptance test that depends on the inverse temperature . The method should then proceed to the next bond, until all bonds have been attempted. Test your implementation using the code below to check that the chain configuration updates as expected.
Your Answer: complete the code below
- Using the code provided below, compare the analytical and Monte Carlo results for the chain’s end-to-end distance and tangent correlation function. What causes the discrepancies?
Your Answer:
- In our implementation, we performed the Monte Carlo update by picking random bonds in the chain and running a Metropolis acceptance test separately. However, this introduces a subtle non-physical bias in the sampling of states. What is the origin of this bias? Why do you think we opted to sweep through the chain, instead of updating all bonds at once and doing a single Metropolis acceptance test on the entire chain?
Your Answer:
- An DNA strand consists of a sequence of nucleotides resembling a polymer. When part of a DNA strand becomes looped within a cell, it becomes harder for the cell’s machinery to access the looped region and create RNA and eventually proteins. Looping thus represents a physical mechanism that can control the expression of genes within a cell. Which physical property of the wormlike chain model would you expect to control whether not a given part of the DNA sequence is looped? For more information on this problem, see Geggier & Vologodskii, 2010.
Your Answer:
import numpy as np
def random_unit_vec():
"""Return a random unit vector on S^2 (shape (3,))."""
z = np.random.uniform(-1.0, 1.0)
phi = np.random.uniform(0.0, 2.0 * np.pi)
r = np.sqrt(max(0.0, 1.0 - z * z))
return np.array([r * np.cos(phi), r * np.sin(phi), z], dtype=float)
class Bond:
"""Unit-length bond/tangent vector in R^3 with in-place rotations.
Parameters:
v (np.ndarray): Initial vector (shape (3,)); will be normalized.
Attributes:
v (np.ndarray): Unit vector (shape (3,)).
Methods:
rotate_in_place(axis: np.ndarray, angle: float): Rotate about an
arbitrary axis by Rodrigues' formula; re-normalize.
copy(): Return a copy of the Bond.
"""
def __init__(self, v):
v = np.asarray(v, dtype=float).reshape(3)
n = np.linalg.norm(v)
if n == 0.0:
raise ValueError("Zero vector is not allowed for Bond.")
self.v = v / n
def rotate_in_place(self, axis: np.ndarray, angle: float):
"""
Given an arbitrary axis and angle, rotate a vector in place by Rodrigues'
formula, and then re-normalize.
Args:
axis (np.ndarray): Axis of rotation (shape (3,)).
angle (float): Angle of rotation (radians).
Returns:
None
"""
a = np.asarray(axis, dtype=float).reshape(3)
na = np.linalg.norm(a)
if na == 0.0:
# Degenerate axis: no rotation
return
a /= na
c, s = np.cos(angle), np.sin(angle)
v = self.v
# Rodrigues rotation
v_rot = v * c + np.cross(a, v) * s + a * (np.dot(a, v)) * (1.0 - c)
# Guard against FP drift
self.v = v_rot / np.linalg.norm(v_rot)
# Numpy interop: treat Bond like its vector
def __array__(self, dtype=None):
return np.asarray(self.v, dtype=dtype)
def copy(self):
return Bond(self.v.copy())
class WormlikeChainMC:
"""Discrete 3D wormlike chain with Metropolis updates in tangent space.
The chain is represented by N unit bond vectors (Bond objects) of fixed length b.
Positions are reconstructed by r_{k+1} = r_k + b t_k (with r_0 at the origin).
Bending energy: E = kappa * sum_i (1 - t_i · t_{i+1}).
Parameters:
N (int): Number of bonds.
b (float): Bond length (segment length).
kappa (float): Bending stiffness (k_B T units). Persistence length l_p = kappa * b.
beta (float): Inverse temperature 1/T (k_B=1).
random_state (int | None): Random state for numpy.random.
Attributes:
N (int): Number of bonds.
b (float): Bond length.
kappa (float): Bending stiffness.
beta (float): Inverse temperature.
bonds (list[Bond]): List of N Bond instances (unit vectors).
"""
def __init__(self, N=400, b=1.0, kappa=20.0, beta=1.0, random_state=None):
self.N, self.b = int(N), float(b)
self.kappa, self.beta = float(kappa), float(beta)
self.random_state = random_state
np.random.seed(random_state)
## start nearly straight with small transverse noise, then normalize per Bond
base = np.tile(np.array([1.0, 0.0, 0.0], dtype=float), (self.N, 1))
base += 0.05 * np.random.normal(size=base.shape)
self.bonds = [Bond(v) for v in base]
## calculate the persistence length
self.lp = self.kappa * self.b * self.beta
def tangent_vectors(self):
"""Stack current bond vectors into a (N, 3) ndarray."""
return np.vstack([b.v for b in self.bonds])
def positions(self):
"""Return positions r_k, k=0..N as shape (N+1,3) with r_0 = 0."""
r = np.zeros((self.N + 1, 3), dtype=float)
np.cumsum(self.b * self.tangent_vectors(), axis=0, out=r[1:])
return r
def energy(self):
"""Compute the total bending energy of the chain."""
t = self.tangent_vectors()
dots = (t[:-1] * t[1:]).sum(axis=1)
return float(self.kappa * np.sum(1.0 - dots))
def _local_deltaE(self, i: int, trial_vec: np.ndarray):
"""Energy change from replacing t_i by trial_vec (only neighbors matter)."""
################################################################################
#
#
# YOUR CODE HERE
#
# Be sure to watch out for boundary conditions on the chain.
#
#
################################################################################
raise NotImplementedError("Implement this method")
def sweep(self, step_size: float = 0.3):
"""
One Monte Carlo sweep consists of attempting N local rotations, one for each
bond on the chain.
Args:
step_size (float): Typical rotation angle (radians) for proposals.
Returns:
dict: {'acc': float, 'E': float} The fraction of moves that were accepted
and the final energy of the chain after the sweep.
"""
################################################################################
#
#
# YOUR CODE HERE
#
# Your implementation should visit each bond once, propose a new bond, calculate
# the change in energy, and then accept or reject the proposed move using a
# Metropolis acceptance test. Be sure to use the copy method of the Bond object
# to propose a new bond, to avoid modifying the original bond in place.
#
# Your implementation should keep a counter of the number of accepted moves, so
# that you can return the acceptance rate as part of the return value. This is
# useful for diagnosing the hyperparameters of the Monte Carlo simulation.
#
# You should also compute the final energy of the chain after the sweep, so
# that you can return that as part of the return value.
#
################################################################################
raise NotImplementedError("Implement this method")
def end_to_end(self):
"""Return end-to-end vector R and its squared length."""
R = self.b * self.tangent_vectors().sum(axis=0)
return R, float(np.dot(R, R))
def tangent_correlation(self):
"""Compute C(s) = <t_i · t_{i+s}> averaged over i.
Returns:
(np.ndarray, np.ndarray): separations s, correlations C(s).
"""
max_sep = (self.N - 1) // 4 # Largest separation is 1/4 of chain length
s_vals = np.arange(0, max_sep + 1, dtype=int)
t = self.tangent_vectors()
C = np.zeros_like(s_vals, dtype=float)
for si, s in enumerate(s_vals):
dots = (t[: self.N - s] * t[s:]).sum(axis=1)
C[si] = float(np.mean(dots))
return s_vals, C
def simulate(self, n_eq, step_size=0.25):
"""
Simulate the wormlike chain and return self
Args:
n_eq (int): Number of of sweeps to equilibrate the chain.
step_size (float): Proposal rotation angle scale (radians).
Returns:
dict: {'E': np.ndarray, 'R2': np.ndarray, 'acc': np.ndarray, 't_corr': (s, C), 'chain': WormlikeChainMC}
"""
for _ in range(n_eq):
self.sweep(step_size=step_size)
return self
Test and use your code¶
You don’t need to write any new code below, run these cells to confirm that everything is working and to play with your implementation. We can start by running a few sweeps of the chain to see that our code is working.
- If you are working from a local fork of the entire course, then you already have access to the solutions. In this case, make sure to
git pull
to make sure that you are up-to-date (save your work first). - If you are working from a single downloaded notebook, or are working in Google Colab, then you will need to manually download the solutions file from the course repository. The lines below will do this for you.
## Optional: Download and import the instructor solutions
import requests
ns={}
exec(requests.get('https://raw.githubusercontent.com/williamgilpin/cphy/main/hw/solutions/wormlike_chain.py', timeout=10).text, ns);
WormlikeChainMC = ns['WormlikeChainMC']
mc = WormlikeChainMC(N=200, b=1.0, kappa=30.0, beta=0.5, random_state=0)
positions_initial = mc.positions().copy()
mc.simulate(50)
positions_final = mc.positions().copy()
## side-by-side 3d subplots of the initial and final positions
fig = plt.figure(figsize=(10, 5))
ax1 = fig.add_subplot(1, 2, 1, projection="3d")
ax2 = fig.add_subplot(1, 2, 2, projection="3d")
ax1.plot(positions_initial[:, 0], positions_initial[:, 1], positions_initial[:, 2])
ax2.plot(positions_final[:, 0], positions_final[:, 1], positions_final[:, 2])
plt.show()

We can also make a video of the chain by running sweeps in small batches, and storing the positions at each step.
## Run a few sweeps at a time to get a video of the chain
all_positions = []
mc = WormlikeChainMC(N=200, b=1.0, kappa=30.0, beta=0.5, random_state=0)
for _ in range(300):
mc.simulate(4)
all_positions.append(mc.positions().copy())
all_positions = np.array(all_positions)
## Boilerplate code to make a video of the chain
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from IPython.display import HTML
frames = np.asarray(all_positions) # (T, N, 3)
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(projection='3d')
(line,) = ax.plot([], [], [], lw=1)
def init():
xyz = frames.reshape(-1, 3)
ax.set_xlim(xyz[:, 0].min(), xyz[:, 0].max())
ax.set_ylim(xyz[:, 1].min(), xyz[:, 1].max())
ax.set_zlim(xyz[:, 2].min(), xyz[:, 2].max())
ax.set_xticks([]); ax.set_yticks([]); ax.set_zticks([])
return (line,)
def update(i):
"""
Args:
i (int): Frame index.
Returns:
(tuple): Artists to redraw.
"""
P = frames[i] # (N, 3)
line.set_data(P[:, 0], P[:, 1])
line.set_3d_properties(P[:, 2])
return (line,)
ani = FuncAnimation(fig, update, frames=len(frames), init_func=init, interval=50, blit=False)
HTML(ani.to_jshtml())
# from matplotlib.animation import FFMpegWriter
# Save to MP4 (requires ffmpeg installed and on PATH)
# ani.save("wormlike_chain_trajectory.mp4", writer=FFMpegWriter(fps=20))
We can now run many simulations of different chains to compute the average tangent correlation and end-to-end distance. This step can take a while to run, and so you may need to adjust the number of sweeps to run.
all_tcorr = [] # list of arrays, each array is the tangent correlation for a different random experiment
all_r2 = [] # list of arrays, each array is the end-to-end distance for a different random experiment
for seed in range(30):
print(f"seed={seed}", flush=True)
mc = WormlikeChainMC(N=200, b=1.0, kappa=30.0, beta=0.5, random_state=seed)
mc.simulate(4000)
all_tcorr.append(mc.tangent_correlation()[1])
all_r2.append(mc.end_to_end()[1])
seed=0
seed=1
seed=2
seed=3
seed=4
seed=5
seed=6
seed=7
seed=8
seed=9
seed=10
seed=11
seed=12
seed=13
seed=14
seed=15
seed=16
seed=17
seed=18
seed=19
seed=20
seed=21
seed=22
seed=23
seed=24
seed=25
seed=26
seed=27
seed=28
seed=29
We can now plot the average tangent correlation and end-to-end distance, and compare them to the theoretical predictions.
## Plot the tangent correlation
plt.plot(np.array(all_tcorr).T, color="black", alpha=0.1)
scale = mc.tangent_correlation()[0]
plt.plot(np.mean(np.array(all_tcorr), axis=0), color="black", label="Monte Carlo Average")
plt.plot(np.exp(-scale / (mc.kappa * mc.b) / 0.5), color="blue", linestyle="--", label="Theory")
plt.xlabel("Distance along chain")
plt.ylabel("Tangent correlation")
plt.legend(frameon=False)
plt.show()
# Predicted end-to-end distance
lp = mc.lp * 0.5
L = mc.N * mc.b
R2_theory = 2 * lp * L * (1 - (lp / L) * (1 - np.exp(-L / lp)))
print(R2_theory)
print(np.median(np.array(all_r2)))
plt.hist(np.array(all_r2));
plt.xlabel("End-to-end distance squared")
plt.ylabel("Frequency")
plt.show()

2887.500000000295
5760.058994004552

- Geggier, S., & Vologodskii, A. (2010). Sequence dependence of DNA bending rigidity. Proceedings of the National Academy of Sciences, 107(35), 15421–15426. 10.1073/pnas.1004809107