RFdiffusion (patched version)
Overview
RFdiffusion is an open-source method for protein structure generation, with or without conditional information (a motif, a target, etc.). It can address a range of protein design challenges:
- Motif Scaffolding
- Unconditional protein generation
- Symmetric unconditional generation (cyclic, dihedral and tetrahedral symmetries currently implemented, more coming!)
- Symmetric motif scaffolding
- Binder design
- Design diversification ("partial diffusion", sampling around a design)
Source
Watson, J.L., Juergens, D., Bennett, N.R. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023). https://doi.org/10.1038/s41586-023-06415-8
Scope of This Documentation
This documentation does not aim to explain RFdiffusion itself, nor provide detailed guidance on how to use its various generative modeling modes.
Instead, it focuses on the specificities of using RFdiffusion on the RPBS HPC cluster, including:
- the patched version maintained by RPBS and its user-visible differences
- how to load the environment via module load
- a generic SLURM job script example to run predictions
- practical examples that illustrate key HPC-specific features:
  - how model weights are handled
  - how output directories and prefixes are set
  - how schedule directories are automatically managed or overridden
For complete usage instructions, prediction modes, and advanced options, please refer to the official RFdiffusion documentation.
RFdiffusion Patched Version: What Changes For You?
The RFdiffusion version deployed on RPBS HPC is a patched version of the official v1.1.0 release.
For users, this means:
- you don’t need to download or specify the model weights.
  - the models are automatically loaded from /shared/banks/ckpt_models/rfdiffusion/1.1.0/.
- you don’t need to set the schedule cache path.
  - the .schedules directory is automatically created inside your output folder (${outdir}/.schedules).
- bug fixes are already applied.
  - these corrections improve stability and compatibility but require no action from your side. See the changelog for details.
Getting Started on RPBS HPC
This section explains how to launch RFdiffusion jobs on the RPBS HPC infrastructure using the patched version.
Loading the Environment
Before using RFdiffusion, load the preconfigured module environment:
module load rfdiffusion/1.1.0-patched-rpbs
This sets up:
- a containerized RFdiffusion runtime environment with GPU support
- ready-to-use CLI entry points like run_inference.py
You should now have access to the run_inference.py main entry point. To list its options:
srun run_inference.py -h
SLURM Job Script Example
To follow the example below, first create an input directory and download the example protein:
mkdir -p inputs
wget -O inputs/5TPN.pdb https://files.rcsb.org/download/5TPN.pdb
Below is a minimal SLURM script for running a motif scaffolding task. It will:
- use 5TPN.pdb as the input structure (it must exist in the inputs/ directory)
- anchor the design on a fixed segment: residues A163–181
- inpaint 10–20 residues on both the N- and C-terminal sides of this anchor
- generate two designed structures
- write the outputs to the results/ folder, with filenames starting with mydesign
#!/bin/bash
#SBATCH --job-name=rfdiff # Job name
#SBATCH --account=ACCOUNT-PROJECT-NAME # Accounting project
#SBATCH --nodelist=gpu-node18 # Request specific node with GPU
#SBATCH --gres=gpu:1 # Request 1 GPU
#SBATCH --cpus-per-task=4 # Request 4 CPU cores
#SBATCH --mem=4G # Request 4 GB of RAM
#SBATCH --time=01:00:00 # Set a 1 hour time limit
#SBATCH --output=logs/rfdiff_%j.out # Save standard output to log file
#SBATCH --error=logs/rfdiff_%j.err # Save standard error to log file
# Load the rfdiffusion environment
module load rfdiffusion/1.1.0-patched-rpbs
# Run RFdiffusion – Motif Scaffolding Task
run_inference.py \
    inference.input_pdb=inputs/5TPN.pdb \
    inference.outdir=results \
    inference.output_prefix=mydesign \
    inference.num_designs=2 \
    'contigmap.contigs=[10-20/A163-181/10-20]'
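One possible way to submit the script above is sketched here. The file name rfdiff_job.sh is an assumption, since this documentation does not name the script. SLURM does not create the directory referenced by --output and --error, so the sketch creates logs/ before submitting.

```shell
# Hypothetical file name: save the job script shown above under this name.
job_script="rfdiff_job.sh"

# The #SBATCH --output/--error lines write into logs/, which SLURM does not
# create on its own, so create it before submitting.
mkdir -p logs

if command -v sbatch >/dev/null 2>&1 && [ -f "$job_script" ]; then
    sbatch "$job_script"      # prints the job ID on success
    squeue -u "$USER"         # optional: check the job state
else
    echo "sbatch or $job_script not available; nothing submitted" >&2
fi
```

Run this from the directory that contains inputs/, so that the relative paths in the job script resolve as expected.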
After running the SLURM job shown above, your results/ directory should contain the following:
Path / Filename | Description |
---|---|
logs_2025-06-02_08-50-47/ | Log output directory from the RFdiffusion run |
└── run_inference.log | Log file capturing stdout/stderr |
mydesign_0.pdb | Final designed structure – model 0 |
mydesign_0.trb | Internal metadata (pickle trace bundle) – model 0 |
mydesign_1.pdb | Final designed structure – model 1 |
mydesign_1.trb | Internal metadata (pickle trace bundle) – model 1 |
traj/ | Folder containing reverse-ordered denoising trajectories |
├── mydesign_0_pX0_traj.pdb | Model predictions at each timestep – model 0 |
├── mydesign_0_Xt-1_traj.pdb | Inputs to the model at each timestep – model 0 |
├── mydesign_1_pX0_traj.pdb | Model predictions at each timestep – model 1 |
└── mydesign_1_Xt-1_traj.pdb | Inputs to the model at each timestep – model 1 |
.schedules/ | Automatically created cache folder for diffusion schedules |
└── T_50_omega_1000_min_...pkl | Pickle file with precomputed schedule parameters |
See the official documentation for more details.
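To check that a run produced every per-design file listed above, a small helper like the following could be used. It is only a sketch: the function name missing_outputs is hypothetical (not part of RFdiffusion), and it assumes the PREFIX_INDEX.pdb / PREFIX_INDEX.trb naming shown in the listing.

```shell
# missing_outputs OUTDIR PREFIX NUM_DESIGNS
# Prints one line per expected .pdb/.trb file that is absent.
# (Hypothetical helper, not part of RFdiffusion.)
missing_outputs() {
    mo_dir=$1
    mo_prefix=$2
    mo_n=$3
    mo_i=0
    while [ "$mo_i" -lt "$mo_n" ]; do
        for mo_ext in pdb trb; do
            mo_file="${mo_dir}/${mo_prefix}_${mo_i}.${mo_ext}"
            [ -f "$mo_file" ] || echo "$mo_file"
        done
        mo_i=$((mo_i + 1))
    done
}

# Values matching the example job: outdir=results, prefix=mydesign, two designs.
missing_outputs results mydesign 2
```

An empty output means all expected design files are present; any printed path points at a design that did not finish or was written elsewhere.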
Specific Patched Version Behavior
1. Model Weights – No Download Required
By default, RFdiffusion uses pretrained weights stored on shared banks:
/shared/banks/ckpt_models/rfdiffusion/1.1.0/
The models were downloaded with the download_models.sh script.
You therefore no longer need to set the path to the models folder. To override the default path used to access the RFdiffusion models, set:
export MODELS_PATH=/your/custom/models/path
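Before pointing MODELS_PATH at a custom folder, it can help to confirm that the folder actually holds checkpoint files. The sketch below assumes the weights are .pt files, as fetched by the upstream download_models.sh script. The helper name use_models_dir is hypothetical.

```shell
# use_models_dir DIR
# Exports MODELS_PATH=DIR only if DIR contains at least one .pt checkpoint.
# (Hypothetical helper; assumes checkpoints use the .pt extension.)
use_models_dir() {
    for umd_ckpt in "$1"/*.pt; do
        if [ -f "$umd_ckpt" ]; then
            export MODELS_PATH="$1"
            return 0
        fi
    done
    echo "no .pt checkpoint found in $1; MODELS_PATH left unchanged" >&2
    return 1
}

# Example call with the placeholder path used above.
use_models_dir /your/custom/models/path || true
```

In a job script, this would go before the run_inference.py call, so that the variable is set when inference starts.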
2. Output Prefix and Directory – Independent Control
You can separately define the output directory and filename prefix:
inference.output_prefix=mydesign
inference.outdir=results/
3. Schedule Cache Directory – Automatic Handling
You don’t need to set the .schedules path manually. The patched RFdiffusion version automatically creates it inside your output folder: ${outdir}/.schedules/.
If needed, you can override the location using: inference.schedule_directory_path=/my/custom/cache/
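If several runs should share one schedule cache, the override can be built once and appended to each run_inference.py command. This is a minimal sketch: the helper name schedule_override and the cache location are assumptions made for illustration.

```shell
# schedule_override DIR
# Creates DIR if needed and prints the matching command-line override.
# (Hypothetical helper, not part of RFdiffusion.)
schedule_override() {
    mkdir -p "$1" || return 1
    printf 'inference.schedule_directory_path=%s\n' "$1"
}

# In a job script (hypothetical cache location), the override is passed
# alongside the other options for your task:
#   run_inference.py \
#       inference.outdir=results \
#       inference.output_prefix=mydesign \
#       "$(schedule_override "$HOME/rfdiffusion_schedules/")"
```

Creating the directory up front avoids relying on the tool to create a custom location, which this documentation only states for the default ${outdir}/.schedules/ folder.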