An Explanation of the RFdiffusion Potential Classes - USB迷 | Focused on Internet Sharing

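1. The `Potential` class

Purpose

Potential is an interface class that defines what every potential function must provide: any class that inherits from it has to implement the compute method. It is not declared with Python's abc module; it is an ordinary class whose compute raises an error until a subclass overrides it.
The design relies on polymorphism: the shared interface ensures that every subclass supplies its own scoring logic, while callers use one common API (the compute method).

Source code: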

class Potential:
    '''
        Interface class that defines the functions a potential must implement
    '''

    def compute(self, xyz):
        '''
            Given the current structure of the model prediction, return the current
            potential as a PyTorch tensor with a single entry

            Args:
                xyz (torch.tensor, size: [L,27,3]: The current coordinates of the sample
            
            Returns:
                potential (torch.tensor, size: [1]): A potential whose value will be MAXIMIZED
                                                     by taking a step along it's gradient
        '''
        raise NotImplementedError('Potential compute function was not overwritten')

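Key points

The compute method:
- Input xyz: the current coordinates of the protein, with shape [L, 27, 3].
  - L: the number of residues.
  - 27: the number of atom positions stored per residue.
  - 3: the x, y, z coordinates.
- Output potential: a tensor holding a single value that scores the current structure. Per the docstring, this value is maximized by stepping along its gradient, so a higher value means a more favorable structure (the opposite sign convention from a physical energy).
compute behaves like an abstract method: if a subclass does not override it, calling it raises NotImplementedError.

Role

Provides a uniform interface, so that different potential functions (for example ones that score compactness, energy, or distance constraints) can be added in the same way.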
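To make the interface contract concrete, here is a minimal sketch of a subclass. ToyCentroidPotential is a hypothetical class invented for this illustration and is not part of RFdiffusion; the base class is repeated in shortened form so the snippet runs on its own.

```python
import torch


class Potential:
    """Interface class: subclasses must override compute()."""

    def compute(self, xyz):
        raise NotImplementedError('Potential compute function was not overwritten')


class ToyCentroidPotential(Potential):
    """Hypothetical example: scores higher when the Cα centroid is near the origin."""

    def compute(self, xyz):
        ca = xyz[:, 1]                              # [L, 3], atom index 1 is Cα
        return -torch.linalg.norm(ca.mean(dim=0))   # 0-dim tensor, higher is better


xyz = torch.ones(5, 27, 3)  # dummy coordinates with the expected [L, 27, 3] layout

# Calling compute() on the base class fails, as the interface intends.
try:
    Potential().compute(xyz)
    base_raised = False
except NotImplementedError:
    base_raised = True

# The subclass returns a single value through the same method name.
value = ToyCentroidPotential().compute(xyz)
```

Any code that holds a list of Potential objects can call compute on each of them without knowing which concrete class it is dealing with.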

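2. The `monomer_ROG` class

Purpose

monomer_ROG is a subclass of Potential. It defines a potential that scores the compactness of a monomer, based on the radius of gyration (ROG).
Its compute method calculates the radius of gyration over the Cα atoms and returns its negative, scaled by weight. Because potentials are maximized, this drives the radius of gyration down.

Source code with comments: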

import torch


class monomer_ROG(Potential):
    '''
        Radius of Gyration potential for encouraging monomer compactness
    '''

    def __init__(self, weight=1, min_dist=15):
        self.weight = weight  # scales the contribution of the ROG term
        self.min_dist = min_dist  # floor applied to each Cα-to-centroid distance

    def compute(self, xyz):
        Ca = xyz[:, 1]  # [L, 3], backbone Cα coordinates (atom index 1)

        centroid = torch.mean(Ca, dim=0, keepdim=True)  # centroid, shape [1, 3]

        dgram = torch.cdist(
            Ca[None, ...].contiguous(), centroid[None, ...].contiguous(), p=2
        )  # Euclidean distance from each Cα to the centroid, shape [1, L, 1]

        dgram = torch.maximum(
            self.min_dist * torch.ones_like(dgram.squeeze(0)), dgram.squeeze(0)
        )  # raise distances below min_dist up to min_dist, shape [L, 1]

        rad_of_gyration = torch.sqrt(
            torch.sum(torch.square(dgram)) / Ca.shape[0]
        )  # radius of gyration, a scalar (0-dim) tensor

        return -1 * self.weight * rad_of_gyration

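Walkthrough of the source:

1. Extract the backbone Cα coordinates:

Ca = xyz[:, 1]

This selects the Cα coordinates of every residue (shape [L, 3]).
Cα is a backbone atom present in every residue; it is used here to compute both the centroid and the radius of gyration.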

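2. Compute the centroid:

centroid = torch.mean(Ca, dim=0, keepdim=True)

The centroid is the mean of all Cα coordinates and serves as the center of the molecule. With keepdim=True its shape is [1, 3].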

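3. Compute the distance from each Cα to the centroid:

dgram = torch.cdist(
    Ca[None, ...].contiguous(), centroid[None, ...].contiguous(), p=2
)

torch.cdist computes the Euclidean distance (p=2) from each Cα atom to the centroid.
The output has shape [1, L, 1]: one batch, L Cα atoms, one centroid. squeeze(0) reduces it to [L, 1].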
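The shapes in this step are easy to get wrong, so here is a small check, assuming random coordinates in place of a real structure. It also compares the cdist result against a direct norm of the coordinate differences, which is an equivalent way to get the same distances.

```python
import torch

L = 8                                            # number of residues (arbitrary)
Ca = torch.randn(L, 3)                           # stand-in Cα coordinates
centroid = torch.mean(Ca, dim=0, keepdim=True)   # [1, 3]

# cdist expects batched inputs [B, P, 3] and [B, R, 3] and returns [B, P, R].
dgram = torch.cdist(
    Ca[None, ...].contiguous(), centroid[None, ...].contiguous(), p=2
)
squeezed = dgram.squeeze(0)                      # drops the batch dimension

# Equivalent computation without cdist: norm of each Cα offset from the centroid.
direct = torch.linalg.norm(Ca - centroid, dim=-1)   # [L]

print(tuple(dgram.shape), tuple(squeezed.shape), tuple(direct.shape))
```

The result is a distance per (Cα, centroid) pair, so there is no trailing coordinate dimension of size 3.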

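4. Apply the minimum distance:

dgram = torch.maximum(
    self.min_dist * torch.ones_like(dgram.squeeze(0)), dgram.squeeze(0)
)

Any distance below min_dist is set to min_dist. A residue already within min_dist of the centroid therefore contributes a constant to the sum, so the potential no longer rewards moving it closer and the radius of gyration is not driven arbitrarily small.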
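A small sketch of what the floor does, using two made-up distances. It also shows that torch.clamp with min= gives the same values, and that the floored entry receives zero gradient, which is why residues inside min_dist are no longer pulled inward through their own distance term.

```python
import torch

min_dist = 15.0
# Two illustrative Cα-to-centroid distances, shaped [L, 1] like the squeezed dgram.
dist = torch.tensor([[3.0], [22.5]], requires_grad=True)

# Same construction as in monomer_ROG.compute.
clamped = torch.maximum(min_dist * torch.ones_like(dist), dist)

# Equivalent, shorter spelling of the same floor.
same = torch.clamp(dist, min=min_dist)

# Gradient of the floored values with respect to the raw distances.
clamped.sum().backward()
grad = dist.grad   # 0 where the floor is active, 1 where the distance passes through

print(clamped.detach().flatten().tolist(), grad.flatten().tolist())
```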

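5. Compute the radius of gyration:

rad_of_gyration = torch.sqrt(
    torch.sum(torch.square(dgram)) / Ca.shape[0]
)

This is the root mean square of the floored distances: the squared distances are summed, divided by the number of residues L, and the square root is taken.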
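As a worked example, take four points placed 2 Å from the origin along the x and y axes. Their centroid is the origin and every distance is 2, so the radius of gyration is 2. The helper radius_of_gyration below is a hypothetical standalone function that mirrors the steps of compute; it is not part of RFdiffusion.

```python
import torch


def radius_of_gyration(Ca, min_dist=0.0):
    """Root-mean-square distance of Cα atoms to their centroid, with a floor."""
    centroid = Ca.mean(dim=0, keepdim=True)            # [1, 3]
    dist = torch.linalg.norm(Ca - centroid, dim=-1)    # [L]
    dist = torch.clamp(dist, min=min_dist)             # apply the min_dist floor
    return torch.sqrt(torch.sum(dist ** 2) / Ca.shape[0])


Ca = torch.tensor([
    [2.0, 0.0, 0.0],
    [-2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, -2.0, 0.0],
])

rg_plain = radius_of_gyration(Ca)         # no floor: every distance is 2
rg_floor = radius_of_gyration(Ca, 15.0)   # default floor: every distance becomes 15

print(rg_plain.item(), rg_floor.item())
```

With the default min_dist of 15, a structure this small always reports 15, which shows that the floor sets a lower bound on the value the potential can reach.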

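6. Return the potential:

return -1 * self.weight * rad_of_gyration

Returns the negative radius of gyration, scaled by weight. The value of a potential is maximized during optimization, so the minus sign is what makes a smaller radius of gyration score higher.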
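The effect of the sign can be checked with one gradient step. The sketch below is an illustration only: rog_potential is a hypothetical function-style rewrite of compute, and the plain gradient-ascent update with a step size of 5.0 is an assumption made for the demo, not the way RFdiffusion schedules its guidance.

```python
import torch


def rog_potential(xyz, weight=1.0, min_dist=15.0):
    """Function form of monomer_ROG.compute: negative, weighted radius of gyration."""
    Ca = xyz[:, 1]                                              # [L, 3]
    centroid = Ca.mean(dim=0, keepdim=True)                     # [1, 3]
    dist = torch.cdist(Ca[None], centroid[None], p=2).squeeze(0)  # [L, 1]
    dist = torch.clamp(dist, min=min_dist)
    return -weight * torch.sqrt(torch.sum(dist ** 2) / Ca.shape[0])


torch.manual_seed(0)
# Random, spread-out coordinates standing in for a model prediction.
xyz = (torch.randn(50, 27, 3) * 30.0).requires_grad_(True)

before = rog_potential(xyz)
before.backward()                      # gradient of the potential w.r.t. coordinates

with torch.no_grad():
    stepped = xyz + 5.0 * xyz.grad     # step ALONG the gradient to increase the potential

after = rog_potential(stepped)
print(before.item(), after.item())     # the potential rises, so the radius shrinks
```

Only the Cα slot (index 1) receives a nonzero gradient, because that is the only atom compute reads.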

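About the radius of gyration (ROG)

1. Radius of gyration (ROG)

The radius of gyration is a statistic describing how a molecule's atoms are distributed in space: it measures how tightly they cluster around the center. For N atoms at positions r_i it is defined as:

$$R_g = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\lVert \mathbf{r}_i - \mathbf{r}_c \rVert^2}, \qquad \mathbf{r}_c = \frac{1}{N}\sum_{i=1}^{N}\mathbf{r}_i$$

This is the unweighted form that matches monomer_ROG.compute, where the r_i are the Cα positions and r_c is their centroid. The mass-weighted definition additionally weights each term by atomic mass.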

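2. Why compute the radius of gyration from alpha carbons only?

Structural role of the alpha carbon:
- The alpha carbon (Cα) is a backbone atom of every amino acid; it sits in the middle of the N-Cα-C backbone unit.
- The Cα trace directly reflects the overall shape of the protein backbone, which makes it a natural reference point for analyzing 3D structure.
Simpler computation:
- A protein contains many atoms (side-chain atoms, hydrogens, and so on); including all of them would noticeably increase the computational cost.
- Using Cα as one representative point per residue keeps the computation simple while preserving the main information about the protein's spatial structure.
At the core of measuring compactness:
- Compactness is a key property in protein structure design and is determined mainly by the backbone. The Cα radius of gyration captures backbone compactness well without needing side-chain detail.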
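To illustrate that a Cα-only trace is enough to separate shapes by compactness, the sketch below compares two idealized 100-residue traces: a fully extended line with 3.8 Å between consecutive Cα atoms, and an alpha-helix-like trace (2.3 Å radius, 100 degrees and 1.5 Å rise per residue). Both geometries and the helper ca_radius_of_gyration are assumptions made for this example.

```python
import math
import torch


def ca_radius_of_gyration(Ca):
    """Unfloored radius of gyration over a [L, 3] tensor of Cα coordinates."""
    centroid = Ca.mean(dim=0, keepdim=True)
    return torch.sqrt(torch.sum((Ca - centroid) ** 2) / Ca.shape[0])


n = 100
idx = torch.arange(n, dtype=torch.float32)

# Fully extended trace: Cα atoms on a straight line, 3.8 Å apart.
extended = torch.stack([3.8 * idx, torch.zeros(n), torch.zeros(n)], dim=1)

# Idealized helical trace: 100 degrees of rotation and 1.5 Å of rise per residue.
angle = idx * math.radians(100.0)
helix = torch.stack(
    [2.3 * torch.cos(angle), 2.3 * torch.sin(angle), 1.5 * idx], dim=1
)

rg_extended = ca_radius_of_gyration(extended)
rg_helix = ca_radius_of_gyration(helix)

print(rg_extended.item(), rg_helix.item())   # the helical trace has the smaller value
```

No side-chain coordinates are needed to rank the two traces, which is the point made in the list above.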

3. The significance of radius-of-gyration optimization in RFdiffusion
