AlphaFold 2 by RPBS

Connection

To use AlphaFold 2 by RPBS, log in to the JupyterHub interface with your ipop-up account username and password.

You can request an ipop-up account by email at ipop-up-account-request@rpbs.univ-paris-diderot.fr or on the RPBS discourse.

Server Options

Here you can define the resources needed by your job: the number of CPUs, the amount of RAM and the GPU card.

  • To run AlphaFold 2 you need at least one A100 GPU card. You can choose among different slices of the card (1, 2, 4 or 7 GPU slices):

    • A100 1G20B: ~700 residues
    • A100 2G20B: ~150 residues
    • A100 4G40B: ~3000 residues
    • A100 7G80B: ~4000 residues
  • Reserve at least 20 GB of RAM.

  • One or two CPUs are enough to run the job.

Creating and running a Notebook

To create a new notebook, click on the Notebook section and select the Colabfold 1.5.5 kernel.

  • In the first cell type:
from colabfold_jupyter import interface
  • Execute the cell by clicking the Run button or by pressing Shift + Enter.
  • The interface should appear. If it does not, type:
interface.show_widgets()

You can now admire the interface and run AlphaFold 2.

Running AlphaFold 2

  • To run AlphaFold 2 you need to provide a protein sequence in the Sequence field. To model a complex, provide several sequences separated by a colon (:).

e.g. for a dimer of beta amyloid:

DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA:DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA

  • You can also specify numerous options, such as:

    • the number of models to generate
    • the number of recycles
    • different MSA options
    • ...
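
When a complex has several chains, it can be convenient to assemble the colon-separated string programmatically before pasting it into the Sequence field. The following is a minimal sketch: `build_query` and `total_residues` are hypothetical helpers, not part of colabfold_jupyter, and restricting the input to the 20 standard amino-acid letters is an assumption of this example rather than a documented rule of the interface.

```python
# Hypothetical helpers, not part of colabfold_jupyter: they only prepare
# the text to paste into the Sequence field.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: standard residues only


def build_query(chains):
    """Join chain sequences with ':' after basic validation."""
    cleaned = []
    for index, chain in enumerate(chains, start=1):
        # Remove all whitespace (spaces, line breaks) and use upper case.
        sequence = "".join(chain.split()).upper()
        if not sequence:
            raise ValueError(f"chain {index} is empty")
        unknown = sorted(set(sequence) - VALID_RESIDUES)
        if unknown:
            raise ValueError(f"chain {index} has unexpected characters: {unknown}")
        cleaned.append(sequence)
    return ":".join(cleaned)


def total_residues(query):
    """Count the residues over all chains of a ':'-separated query."""
    return sum(len(chain) for chain in query.split(":"))


abeta42 = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"
query = build_query([abeta42, abeta42])
print(query)
print(total_residues(query))
```

The total residue count can then be compared with the approximate sizes listed for each GPU slice under Server Options.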

Once you have filled the form, click on the Launch Colabfold button to start the job.

Downloading the results

Once the job is finished, you can download the results by clicking on the Download button.

The results are stored in a zip file that you can extract on your computer. Results are also stored in your project directory on the server (/shared/project/XXX/).
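
If you prefer to unpack the archive with a script, the sketch below uses only the Python standard library. The archive name `results.zip`, the destination directory `results` and the `extract_results` helper are hypothetical, and the assumption that the archive contains `.pdb` model files should be checked against your own download.

```python
import zipfile
from pathlib import Path


def extract_results(zip_path, dest_dir):
    """Extract a results archive and return the extracted PDB files, sorted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively, since the archive may contain sub-directories.
    return sorted(dest.rglob("*.pdb"))


# Hypothetical file and directory names; replace them with your own.
archive_path = Path("results.zip")
if archive_path.exists():
    for pdb_file in extract_results(archive_path, "results"):
        print(pdb_file)
```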

Analyzing the results

You can analyze the results by using the show_pdb_best() function.

interface.show_pdb_best(interface.results)

Going further

You can find more information on the AlphaFold 2 website. If you have any questions or need help, you can ask on the RPBS discourse.

AF2 Analysis

We have also developed and integrated the af2_analysis library to analyze AlphaFold 2 results.

To use it you need to load the library and create a Data object, using the output directory as input:

import af2_analysis
my_data = af2_analysis.Data(interface.results['dir'])

All computed data are extracted into the my_data.df pandas DataFrame.

my_data.df

Additional functions are available to complement the AlphaFold scores, such as pDockQ and pDockQ2.

my_data.compute_pdockq()
my_data.compute_pdockq2()

It is possible to plot the pLDDT scores:

my_data.plot_plddt(range(20))

or the PAE matrix:

my_data.plot_pae(my_data.df['ranking_confidence'].idxmax())
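
Outside the notebook, the same kind of per-model numbers can be inspected with the standard library alone. This is a minimal sketch under stated assumptions: it supposes that the result directory contains per-model JSON score files whose name includes `scores`, each holding a `plddt` list (one value per residue) and a `pae` square matrix, as ColabFold typically writes. The `summarize_scores` helper and the `results` directory name are hypothetical, and the keys may need adapting to your files.

```python
import json
from pathlib import Path
from statistics import mean


def summarize_scores(json_path):
    """Return the residue count, mean pLDDT and mean PAE of one score file.

    Assumption: the file is a JSON object with a "plddt" list and a "pae"
    list of lists; adapt the keys if your files differ.
    """
    scores = json.loads(Path(json_path).read_text())
    plddt = scores["plddt"]
    # Flatten the PAE matrix to average over all residue pairs.
    pae_values = [value for row in scores["pae"] for value in row]
    return {
        "n_residues": len(plddt),
        "mean_plddt": mean(plddt),
        "mean_pae": mean(pae_values),
    }


# Hypothetical directory name; replace it with your own results directory.
results_dir = Path("results")
for score_file in sorted(results_dir.glob("*scores*.json")):
    print(score_file.name, summarize_scores(score_file))
```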

Acknowledgements

  • Julien Rey and Samuel Murail for the deployment of the AlphaFold 2 server at RPBS.
  • Nicolas Chevrollier, Gabriel Tourillon, Gautier Moroy and Pierre Tuffery
  • The RPBS platform for computational resources.
  • IdEx Université Paris Cité n°ANR-18-IDEX-0001 projet GPU-APBS 2023.

References:

  • Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. ColabFold: making protein folding accessible to all.
    Nature Methods (2022) doi: 10.1038/s41592-022-01488-1

  • Jumper et al. "Highly accurate protein structure prediction with AlphaFold."
    Nature (2021) doi: 10.1038/s41586-021-03819-2

  • Evans et al. "Protein complex prediction with AlphaFold-Multimer."
    bioRxiv (2021) doi: 10.1101/2021.10.04.463034