diff --git a/docs/zh/examples/symbolic_gn.md b/docs/zh/examples/symbolic_gn.md
new file mode 100644
index 000000000..bdd562088
--- /dev/null
+++ b/docs/zh/examples/symbolic_gn.md
@@ -0,0 +1,313 @@
+# Graph Networks for Physics Discovery (GN)
+
+AI Studio quick start
+
+Before training or evaluation, make sure the dependencies `scipy`, `scikit-learn`, `celluloid`, and `pgl` are installed:
+```bash
+pip install scipy scikit-learn celluloid pgl
+```
+
+=== "Model training command"
+
+    ``` sh
+    python gn.py mode=train MODEL.arch=OGN DATA.type=spring
+    ```
+=== "Model evaluation command"
+
+    ``` sh
+    python gn.py mode=eval MODEL.arch=OGN EVAL.pretrained_model_path="./outputs_symbolic_gcn/.../checkpoints/latest"
+    ```
+=== "Model inference command"
+
+    ``` sh
+    python gn.py mode=infer MODEL.arch=OGN INFER.pretrained_model_path="./outputs_symbolic_gcn/.../checkpoints/latest"
+    ```
+=== "Model export command"
+
+    ``` sh
+    python gn.py mode=export MODEL.arch=OGN INFER.pretrained_model_path="./outputs_symbolic_gcn/.../checkpoints/latest"
+    ```
+
+## 1. Background
+
+Deep learning is a powerful tool for modeling physical systems, but conventional neural networks are black boxes that offer little physical interpretation. Graph neural networks (GNNs), through their built-in inductive biases, are particularly well suited to modeling interactions in particle systems. Extracting interpretable physical laws from a trained GNN, however, remains a challenge.
+
+Following the ideas of **[Discovering Symbolic Models from Deep Learning with Inductive Biases](https://arxiv.org/abs/2006.11287)**, this example implements a complete physics-discovery framework. The framework imposes strong inductive biases during GNN training (e.g., L1 regularization to encourage sparse representations) and then applies symbolic regression to the learned internal components of the model, extracting explicit physical relations. This approach can rediscover known force laws and Hamiltonians, and can also discover new analytic formulas from complex data.
+
+This example uses Graph Networks (GN) to model the dynamics of several physical systems (springs, gravity, charges, and so on), and then extracts the underlying physical laws via symbolic regression.
+
+## 2. Method
+
+This section summarizes the core ideas of **[Discovering Symbolic Models from Deep Learning with Inductive Biases](https://arxiv.org/abs/2006.11287)** (NeurIPS 2020).
+
+### 2.1 Core problem
+
+**The dilemma**: deep models such as fully connected networks and CNNs fit data well but lack physical interpretability. Can we keep the fitting power of deep learning while also extracting interpretable physical laws?
+
+**The proposed recipe**:
+1. Learn the physical system with an architecture carrying a suitable **inductive bias** (a graph network)
+2. Apply a **sparsity constraint** during training to force the model toward simple representations
+3. Run **symbolic regression** on the learned internal functions to extract analytic expressions
+4. Replace the neural components with the symbolic expressions to obtain an interpretable model
+
+### 2.2 Graph network architecture
+
+Graph networks (GN) are a natural architecture for physical modeling because of:
+- **Permutation invariance**: particle ordering does not affect the result (matching physical symmetry)
+- **Locality**: particles interact through edges (matching the pairwise nature of forces)
+- **Scalability**: the same model handles varying numbers of particles
+
+A GN models the physical system as a graph:
+- **Nodes**: physical entities (e.g., particles) with state xᵢ = [position, velocity, mass, charge]
+- **Edges**: interactions between entities, computed via message passing
+- **Global attributes**: optional system-level quantities (e.g., total energy)
+
+Internally, a GN contains three separable function components:
+- **Edge (message) function φₑ(xᵢ, xⱼ)**: computes the message vector from node j to node i
+- **Node function φᵥ(xᵢ, Σmᵢⱼ)**: updates the node state from the aggregated messages
+- **Global function φᵤ(Σxᵢ)**: computes system-level quantities (e.g., total energy)
+
+**Key point**: these components (φₑ, φᵥ, φᵤ) map naturally onto basic physical concepts:
+- φₑ corresponds to **pairwise forces** (e.g., gravity, the Coulomb force)
+- φᵥ corresponds to **equations of motion** (e.g., Newton's second law)
+- φᵤ corresponds to **energy functions** (e.g., the Hamiltonian)
+
+Hence, if we can run symbolic regression on these functions, we can extract physical laws directly.
+
+### 2.3 Physics-discovery pipeline
+
+The physics-discovery pipeline consists of the following steps:
+
+1. **Design a deep model with separable internal structure**: use a graph network as the core inductive bias, well suited to particle systems
+2. **Train the model end to end** on the available data
+3. **Run symbolic regression on the internal functions**: fit symbolic expressions to components such as φₑ, φᵥ, and φᵤ
+4. **Replace the neural components with the symbolic expressions**, yielding a fully analytic physical model
+
+### 2.4 Sparse representations and regularization
+
+**Core result**: if the true physical system admits a perfect description in a linear latent space, and the GNN is trained with a sufficiently large message dimension, the learned message vectors will be a **linear transformation** of the true force vectors.
+
+For example, for the gravitational system `F = -G·m₁m₂/r²·r̂`, the message vector may learn:
+```
+message = [c₁·dx, c₂·dy, c₃·m₁m₂/r², c₄·..., 0, 0, ...]
+```
+where only a few channels are active and the rest are zero.
+
+To encourage this sparsity, training uses the following regularization strategies:
+
+| Regularization | Model | Mechanism | Config |
+|-----------|---------|---------|---------|
+| **L1** | OGN | L1 penalty on node features, pushing the network toward sparse activations | `l1_strength: 1e-2` |
+| **KL** | VarOGN | Pulls the posterior toward a sparse prior, yielding sparsity naturally | `kl_weight: 1e-3` |
+| **Bottleneck** | all | Limits the message dimension (msg_dim), forcing information compression | `msg_dim: 100` |
+
+**Why sparsity matters** (a minimal sketch follows this list):
+1. **Feasibility of symbolic regression**: a sparse representation involves only a few variables, which greatly shrinks the search space
+2. **Physical interpretability**: the active channels correspond to genuine combinations of physical quantities
+3. **Generalization**: simpler models usually generalize better (Occam's razor)
+
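+The following is a minimal sketch of how an L1 penalty induces message sparsity.
+The `msg_fnc` and all shapes here are illustrative assumptions, not the exact
+training loss in `gn.py` (which applies the penalty to node features, per the
+table above):
+
+```python
+import paddle
+import paddle.nn as nn
+
+# hypothetical message function: a pair of 6-d node states -> 100 channels
+msg_fnc = nn.Sequential(nn.Linear(12, 300), nn.ReLU(), nn.Linear(300, 100))
+
+x_i = paddle.randn([64, 6])  # receiver node features
+x_j = paddle.randn([64, 6])  # sender node features
+messages = msg_fnc(paddle.concat([x_i, x_j], axis=1))  # [64, 100]
+
+fit_loss = paddle.mean(paddle.abs(messages[:, :2]))  # stand-in for the MAE fit term
+loss = fit_loss + 1e-2 * paddle.mean(paddle.abs(messages))  # L1 sparsity penalty
+loss.backward()  # gradients now push inactive channels toward zero
+```
+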
+## 3. Implementation
+
+### 3.1 Datasets
+
+The data generator supports several physical systems:
+
+| System | Potential | Physical meaning |
+|---------|---------|---------|
+| `r2` | `-m₁m₂/r` | Gravity / attractive Coulomb force |
+| `r1` | `m₁m₂·log(r)` | 2D gravity (vortex) |
+| `spring` | `(r-1)²` | Hooke's law (equilibrium length = 1) |
+| `damped` | `(r-1)² + damping·v·x/n` | Damped spring |
+| `string` | `(r-1)² + y·m` | String + gravity |
+| `charge` | `q₁q₂/r` | Coulomb force (repulsive/attractive) |
+| `superposition` | `q₁q₂/r - m₁m₂/r` | Superposed electromagnetic + gravitational forces |
+| `discontinuous` | piecewise | Non-smooth forces (e.g., collisions) |
+| `string_ball` | spring + ball repulsion | Complex constrained system |
+
+The generator integrates trajectories with a high-accuracy ODE solver (`scipy.integrate.odeint`) to ensure physical fidelity.
+
+### 3.2 Data generator
+
+The data generator (`simulate.py`) computes forces with PaddlePaddle automatic differentiation. The core workflow is:
+
+1. **Define the potential**: a potential function U(r) for each physical system
+2. **Differentiate to get forces**: `F = -∇U`, with the gradient of the potential computed by `paddle.grad` (sketched after this list)
+3. **Solve the ODE**: `scipy.integrate.odeint` integrates `d²x/dt² = F/m`
+4. **Data format**:
+   - Input: [x, y, vx, vy, charge, mass] (2D systems)
+   - Label: acceleration [ax, ay] or derivatives [dq/dt, dp/dt]
+
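+A minimal sketch of step 2 above for the spring potential, using `paddle.grad` the
+same way `simulate.py` does (the two-particle setup and values are illustrative):
+
+```python
+import paddle
+
+# two particles in 2D; positions must be differentiable
+pos = paddle.to_tensor([[0.0, 0.0], [2.0, 0.0]], dtype="float64", stop_gradient=False)
+
+dist = paddle.sqrt(paddle.sum(paddle.square(pos[0] - pos[1])))
+u = (dist + 1e-2 - 1) ** 2  # spring potential, with the same softening offset
+
+# F = -dU/dx via reverse-mode autodiff
+force = -paddle.grad(outputs=u, inputs=pos)[0]
+print(force.numpy())  # equal and opposite forces on the two particles
+```
+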
+### 3.3 Models
+
+Three core architectures are implemented:
+
+#### OGN (Object-based Graph Network)
+- **Physical basis**: Newtonian mechanics (F=ma)
+- **Output**: acceleration a = F/m
+- **Structure**:
+  - Message function: `φₑ: (xᵢ, xⱼ) → mᵢⱼ`, maps source and target node features to a message vector (dimension 100)
+  - Node function: `φᵥ: (xᵢ, Σmᵢⱼ) → aᵢ`, aggregates all messages and outputs the acceleration
+- **Use cases**: general dynamical systems (springs, gravity, charges, ...)
+- **Regularization**: L1 regularization encourages sparse message vectors, making symbolic regression easier
+
+#### VarOGN (Variational ODE Graph Network)
+- **Physical basis**: variational inference
+- **Output**: mean and variance of the acceleration (uncertainty quantification)
+- **Structure**:
+  - The message function outputs μ and logσ²
+  - Training samples: `m = μ + ε·exp(logσ²/2)`, with ε ~ N(0,1)
+  - Inference uses the mean: `m = μ`
+- **Use cases**: noisy data / uncertainty modeling / robustness evaluation
+- **Regularization**: a KL-divergence term pulls the posterior toward the prior
+
+#### HGN (Hamiltonian Graph Network)
+- **Physical basis**: Hamiltonian mechanics (energy conservation)
+- **Output**: [dq/dt, dp/dt] (time derivatives of the generalized coordinates and momenta)
+- **Structure**:
+  - Pairwise energy: `Eᵢⱼ = φₑ(xᵢ, xⱼ)`
+  - Self energy: `Eᵢ = φᵥ(xᵢ)`
+  - Hamiltonian: `H = Σᵢ Eᵢ + Σᵢⱼ Eᵢⱼ`
+  - Hamilton's equations: `dq/dt = ∂H/∂p`, `dp/dt = -∂H/∂q`
+- **Use cases**: conservative (dissipation-free) systems, long-horizon prediction
+- **Notes**: Hamilton's equations are evaluated with automatic differentiation, so energy conservation is built in (for dissipation-free systems)
+
+The model is selected dynamically via `MODEL.arch` in the config file:
+
+```yaml
+MODEL:
+  arch: "OGN"  # one of: "OGN", "VarOGN", "HGN"
+  input_keys: ["x", "edge_index"]
+  output_keys: ["acceleration"]  # HGN: ["derivative"]
+  n_f: auto  # automatically set to dimension * 2 + 2
+  msg_dim: 100  # message dimension (OGN/VarOGN)
+  ndim: auto  # automatically set to dimension
+  hidden: 300  # hidden layer width
+```
+
+**Input/output format**:
+
+| Model | Input | Output | Notes |
+|------|---------|---------|------|
+| OGN | `x`: [batch, n_nodes, n_f]<br>`edge_index`: [batch, 2, n_edges] | `acceleration`: [batch, n_nodes, ndim] | predicts acceleration directly |
+| VarOGN | same as above | `acceleration`: [batch, n_nodes, ndim] | predicts the acceleration (mean) |
+| HGN | same as above | `derivative`: [batch, n_nodes, 2*ndim] | first half is dq/dt, second half is dp/dt |
+
+where `n_f = 2*ndim + 2` (position, velocity, charge, mass).
+
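+A small sketch of assembling the `x` node-feature array in this layout (the random
+values are placeholders; `SimulationDataset` generates the real data):
+
+```python
+import numpy as np
+
+ndim, n_nodes = 2, 4
+pos = np.random.randn(n_nodes, ndim)           # x, y
+vel = np.random.randn(n_nodes, ndim)           # vx, vy
+charge = np.sign(np.random.randn(n_nodes, 1))  # ±1, as in the charge systems
+mass = np.exp(np.random.randn(n_nodes, 1))     # log-normal masses
+
+x = np.concatenate([pos, vel, charge, mass], axis=1).astype(np.float32)
+assert x.shape == (n_nodes, 2 * ndim + 2)      # n_f = 2*ndim + 2
+```
+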
+### 3.4 Constraints
+
+Supervision is built with PPSCI's `SupervisedConstraint`. The data flow is:
+
+1. **Dataset preparation**:
+    ```python
+    examples/symbolic_gn/gn.py:113:119
+    ```
+
+2. **Constraint construction**:
+    ```python
+    examples/symbolic_gn/gn.py:145:182
+    ```
+
+3. **Loss function**: adapted automatically to the model type
+    - **OGN/VarOGN**: MAE or MSE loss + L1 regularization (applied to the node features)
+    - **HGN**: the loss is computed only on the acceleration part (the second half of the derivative)
+
+**Automatic configuration**:
+- `batch_size = auto`: computed as `int(64 * (4 / num_nodes)²)`
+- Train/validation split: 0.8/0.2 by default
+- Downsampling: `downsample_factor = 5`, to reduce compute
+
+### 3.5 Optimizer
+
+Training uses the Adam optimizer with a OneCycleLR learning-rate schedule:
+
+```python
+examples/symbolic_gn/gn.py:194:213
+```
+
+Key hyperparameters:
+
+```yaml
+TRAIN:
+  epochs: 1000
+  batch_size: auto  # computed automatically
+  optimizer:
+    learning_rate: 1e-3
+    weight_decay: 1e-8
+  lr_scheduler:
+    name: "OneCycleLR"
+    max_learning_rate: 1e-3
+    final_div_factor: 1e5  # final lr = max_lr / final_div_factor
+  loss:
+    type: "MAE"  # MAE or MSE
+```
+
+**Learning-rate schedule**:
+- OneCycleLR first raises the learning rate from max_lr/divide_factor to max_lr, then decays it to max_lr/final_div_factor
+- This converges quickly while limiting overfitting
+
+### 3.6 Preparing for symbolic regression
+
+The GNN is trained as a precursor to symbolic regression. Following the paper, we encourage sparse latent representations:
+
+**Theory**:
+- If the true physical system admits a perfect linear latent description, the trained GNN's message vectors are a linear transformation of the true force vectors
+- L1 regularization pushes the network toward sparse representations in which only a few message dimensions are active
+- The active dimensions typically correspond to real physical quantities (distance, mass products, ...)
+
+**Workflow**:
+
+1. **Train a sparse GNN**:
+    ```yaml
+    MODEL:
+      msg_dim: 100  # message dimension
+      regularization_type: "l1"
+      l1_strength: 1e-2
+    ```
+
+2. **Extract message data**: message vectors can be recorded during training
+    ```python
+    messages = model.msg_fnc(edge_features)  # [num_edges, msg_dim]
+    # save the messages together with the physical features (dx, dy, r, m1, m2, ...)
+    ```
+
+3. **Symbolic regression**: fit symbolic expressions with a tool such as [PySR](https://github.com/MilesCranmer/PySR)
+    ```python
+    from pysr import PySRRegressor
+
+    # pick the most active message channels
+    active_channels = np.argsort(np.std(messages, axis=0))[-5:]
+
+    # run symbolic regression on each active channel
+    for ch in active_channels:
+        model = PySRRegressor(
+            niterations=40,
+            binary_operators=["+", "*", "/", "-"],
+            unary_operators=["square", "sqrt", "neg"],
+        )
+        model.fit(physical_features, messages[:, ch])
+        print(f"Channel {ch}: {model.sympy()}")
+    ```
+
+4. **Discovered laws**:
+    - Spring system: may recover `m ~ (r-1)` (Hooke's law)
+    - Gravitational system: may recover `m ~ -m1*m2/r²` (Newtonian gravity)
+    - Charge system: may recover `m ~ q1*q2/r²` (Coulomb's law)
+
+## 4. Full code
+
+
+```python
+examples/symbolic_gn/gn.py
+```
+
+
+## 5. Expected results
+
+
+## 6. References
+
+- **Paper**: [Discovering Symbolic Models from Deep Learning with Inductive Biases (NeurIPS 2020)](https://arxiv.org/abs/2006.11287)
+- **Authors**: Miles Cranmer, Alvaro Sanchez-Gonzalez, Peter Battaglia, Rui Xu, Kyle Cranmer, David Spergel, Shirley Ho
+- **Reference implementation**: [github.com/MilesCranmer/symbolic_deep_learning](https://github.com/MilesCranmer/symbolic_deep_learning)
+- **Symbolic regression tool**: [PySR - Python Symbolic Regression](https://github.com/MilesCranmer/PySR)
diff --git a/examples/symbolic_gn/conf/config.yaml b/examples/symbolic_gn/conf/config.yaml
new file mode 100644
index 000000000..5eaececd9
--- /dev/null
+++ b/examples/symbolic_gn/conf/config.yaml
@@ -0,0 +1,143 @@
+defaults:
+  - ppsci_default
+  - TRAIN: train_default
+  - TRAIN/ema: ema_default
+  - TRAIN/swa: swa_default
+  - EVAL: eval_default
+  - INFER: infer_default
+  - hydra/job/config/override_dirname/exclude_keys: exclude_keys_default
+  - _self_
+hydra:
+  run:
+    dir: outputs_symbolic_gcn/${now:%Y-%m-%d}/${now:%H-%M-%S}/${hydra.job.override_dirname}
+  job:
+    name: ${mode}
+    chdir: false
+  callbacks:
+    init_callback:
+      _target_: ppsci.utils.callbacks.InitCallback
+  sweep:
+    dir: ${hydra.run.dir}
+    subdir: ./
+
+mode: train
+seed: 42
+output_dir: ${hydra:run.dir}
+log_freq: 20
+device: gpu
+
+TRAIN:
+  epochs: 1000
+  batch_size: 32
+  save_freq: 100
+  eval_during_train: true
+  eval_freq: 100
+  checkpoint_path: null
+  pretrained_model_path: null
+  train_split: 0.8
+  optimizer:
+    learning_rate: 1e-3
+    weight_decay: 1e-8
+  lr_scheduler:
+    name: "OneCycleLR"
+    max_learning_rate: 1e-3
+    final_div_factor: 1e5
+  loss:
+    type: "MAE"
+    physics_weight: 0.0
+
+EVAL:
+  pretrained_model_path: null
+  eval_with_no_grad: true
+  batch_size: 32
+  message_analysis:
+    enabled: true
+
+export_path: "./exported_model"
+
+INFER:
+  pretrained_model_path: null
+  export_path: ./inference/ogn_model
+  pdmodel_path: ./inference/ogn_model.pdmodel
+  pdiparams_path: ./inference/ogn_model.pdiparams
+  onnx_path: ${INFER.export_path}.onnx
+  device: cpu
+  engine: native
+  precision: fp32
+  batch_size: 1
+  ir_optim: false
+  min_subgraph_size: 1
+  gpu_mem: 2000
+  gpu_id: 0
+  max_batch_size: 1024
+  num_cpu_threads: 10
+
+MODEL:
+  arch: "OGN"
+  input_keys: ["x", "edge_index"]
+  output_keys: ["acceleration"]
+  n_f: auto
+  msg_dim: 100
+  ndim: auto
+  dt: 0.1
+  hidden: 300
+  aggr: "add"
+  regularization_type: "l1"
+  l1_strength: 1e-2
+
+DATA:
+  type: "spring"
+  num_samples: 1000
+  num_nodes: 6
+  dimension: 2
+  time_steps: 100
+  time_step_size: 0.01
+  downsample_factor: 5
+  sample_interval: 5
+
+DATATYPE_TYPE:
+  - sim: "r1"
+    dt: [5e-3]
+    nt: [1000]
+    n: [4, 8]
+    dim: [2, 3]
+  - sim: "r2"
+    dt: [1e-3]
+    nt: [1000]
+    n: [4, 8]
+    dim: [2, 3]
+  - sim: "spring"
+    dt: [1e-2]
+    nt: [1000]
+    n: [4, 8]
+    dim: [2, 3]
+  - sim: "string"
+    dt: [1e-2]
+    nt: [1000]
+    n: [30]
+    dim: [2]
+  - sim: "charge"
+    dt: [1e-3]
+    nt: [1000]
+    n: [4, 8]
+    dim: [2, 3]
+  - sim: "superposition"
+    dt: [1e-3]
+    nt: [1000]
+    n: [4, 8]
+    dim: [2, 3]
+  - sim: "damped"
+    dt: [2e-2]
+    nt: [1000]
+    n: [4, 8]
+    dim: [2, 3]
+  - sim: "discontinuous"
+    dt: [1e-2]
+    nt: [1000]
+    n: [4, 8]
+    dim: [2, 3]
+  - sim: "string_ball"
+    dt: [1e-2]
+    nt: [1000]
+    n: [30]
+    dim: [2]
diff --git a/examples/symbolic_gn/conf/config_hgn.yaml b/examples/symbolic_gn/conf/config_hgn.yaml
new file mode 100644
index 000000000..487249799
--- /dev/null
+++ b/examples/symbolic_gn/conf/config_hgn.yaml
@@ -0,0 +1,135 @@
+defaults:
+  - ppsci_default
+  - TRAIN: train_default
+  - TRAIN/ema: ema_default
+  - TRAIN/swa: swa_default
+  - EVAL: eval_default
+  - 
INFER: infer_default + - hydra/job/config/override_dirname/exclude_keys: exclude_keys_default + - _self_ +hydra: + run: + dir: outputs_symbolic_gcn/${now:%Y-%m-%d}/${now:%H-%M-%S}/${hydra.job.override_dirname} + job: + name: ${mode} + chdir: false + callbacks: + init_callback: + _target_: ppsci.utils.callbacks.InitCallback + sweep: + dir: ${hydra.run.dir} + subdir: ./ + +mode: train +seed: 42 +output_dir: ${hydra:run.dir} +log_freq: 20 +device: gpu + +TRAIN: + epochs: 1000 + batch_size: 32 + save_freq: 10 + eval_during_train: true + eval_freq: 5 + checkpoint_path: null + pretrained_model_path: null + train_split: 0.8 + optimizer: + learning_rate: 1e-3 + weight_decay: 1e-8 + lr_scheduler: + name: "OneCycleLR" + max_learning_rate: 1e-3 + final_div_factor: 1e5 + loss: + type: "MAE" + physics_weight: 0.0 + message_analysis: + enabled: true + record_interval: 1 + test_sample_size: 10 + +EVAL: + pretrained_model_path: null + eval_with_no_grad: true + batch_size: 1024 + +export_path: "./exported_model_hgn" + +INFER: + pretrained_model_path: null + export_path: ./inference/hgn_model + device: gpu + engine: native + precision: fp32 + +MODEL: + arch: "HGN" + input_keys: ["x", "edge_index"] + output_keys: ["acceleration"] + n_f: auto + msg_dim: 100 + ndim: auto + dt: 0.1 + hidden: 300 + aggr: "add" + regularization_type: "energy" + l1_strength: 0.0 + +DATA: + type: "r1" + num_samples: 1000 + num_nodes: 4 + dimension: 2 + time_steps: 10 + time_step_size: 0.01 + downsample_factor: 5 + sample_interval: 5 + +DATATYPE_TYPE: + - sim: "r1" + dt: [5e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "r2" + dt: [1e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "spring" + dt: [1e-2] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "string" + dt: [1e-2] + nt: [1000] + n: [30] + dim: [2] + - sim: "charge" + dt: [1e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "superposition" + dt: [1e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "damped" + dt: [2e-2] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "discontinuous" + dt: [1e-2] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "string_ball" + dt: [1e-2] + nt: [1000] + n: [30] + dim: [2] diff --git a/examples/symbolic_gn/conf/config_varogn.yaml b/examples/symbolic_gn/conf/config_varogn.yaml new file mode 100644 index 000000000..0a989b365 --- /dev/null +++ b/examples/symbolic_gn/conf/config_varogn.yaml @@ -0,0 +1,135 @@ +defaults: + - ppsci_default + - TRAIN: train_default + - TRAIN/ema: ema_default + - TRAIN/swa: swa_default + - EVAL: eval_default + - INFER: infer_default + - hydra/job/config/override_dirname/exclude_keys: exclude_keys_default + - _self_ +hydra: + run: + dir: outputs_symbolic_gcn/${now:%Y-%m-%d}/${now:%H-%M-%S}/${hydra.job.override_dirname} + job: + name: ${mode} + chdir: false + callbacks: + init_callback: + _target_: ppsci.utils.callbacks.InitCallback + sweep: + dir: ${hydra.run.dir} + subdir: ./ + +mode: train +seed: 42 +output_dir: ${hydra:run.dir} +log_freq: 20 +device: gpu + +TRAIN: + epochs: 1000 + batch_size: 32 + save_freq: 100 + eval_during_train: true + eval_freq: 5 + checkpoint_path: null + pretrained_model_path: null + train_split: 0.8 + optimizer: + learning_rate: 1e-3 + weight_decay: 1e-8 + lr_scheduler: + name: "OneCycleLR" + max_learning_rate: 1e-3 + final_div_factor: 1e5 + loss: + type: "MAE" + physics_weight: 0.0 + message_analysis: + enabled: true + record_interval: 1 + test_sample_size: 10 + +EVAL: + pretrained_model_path: null + eval_with_no_grad: true + batch_size: 1024 + +export_path: "./exported_model_varogn" 
+ +INFER: + pretrained_model_path: null + export_path: ./inference/varogn_model + device: gpu + engine: native + precision: fp32 + +MODEL: + arch: "VarOGN" + input_keys: ["x", "edge_index"] + output_keys: ["acceleration"] + n_f: auto + msg_dim: 100 + ndim: auto + dt: 0.1 + hidden: 300 + aggr: "add" + regularization_type: "kl" + l1_strength: 1e-3 + +DATA: + type: "string" + num_samples: 1000 + num_nodes: 4 + dimension: 2 + time_steps: 10 + time_step_size: 0.01 + downsample_factor: 5 + sample_interval: 5 + +DATATYPE_TYPE: + - sim: "r1" + dt: [5e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "r2" + dt: [1e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "spring" + dt: [1e-2] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "string" + dt: [1e-2] + nt: [1000] + n: [30] + dim: [2] + - sim: "charge" + dt: [1e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "superposition" + dt: [1e-3] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "damped" + dt: [2e-2] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "discontinuous" + dt: [1e-2] + nt: [1000] + n: [4, 8] + dim: [2, 3] + - sim: "string_ball" + dt: [1e-2] + nt: [1000] + n: [30] + dim: [2] diff --git a/examples/symbolic_gn/gn.py b/examples/symbolic_gn/gn.py new file mode 100644 index 000000000..b453dc60d --- /dev/null +++ b/examples/symbolic_gn/gn.py @@ -0,0 +1,462 @@ +# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved. + +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at + +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
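+"""Entry point for the symbolic graph-network example: train/eval/export/infer via Hydra."""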
+ +import os +from typing import List +from typing import Optional + +import hydra +import matplotlib.pyplot as plt +import numpy as np +import paddle +from omegaconf import DictConfig +from simulate import SimulationDataset + +import ppsci +from ppsci.arch.symbolic_gn import HGN +from ppsci.arch.symbolic_gn import OGN +from ppsci.arch.symbolic_gn import VarOGN +from ppsci.arch.symbolic_gn import get_edge_index +from ppsci.utils import logger + + +def create_ppsci_model( + model_type: str = "OGN", + input_keys: List[str] = ["x", "edge_index"], + output_keys: List[str] = ["acceleration"], + n_f: int = 6, + msg_dim: int = 100, + ndim: int = 2, + hidden: int = 300, + edge_index: Optional[np.ndarray] = None, + l1_strength: float = 0.0, +): + if model_type == "OGN": + model = OGN( + input_keys=input_keys, + output_keys=output_keys, + n_f=n_f, + msg_dim=msg_dim, + ndim=ndim, + hidden=hidden, + edge_index=edge_index, + l1_strength=l1_strength, + ) + elif model_type == "HGN": + model = HGN( + input_keys=input_keys, + output_keys=output_keys, + n_f=n_f, + ndim=ndim, + hidden=hidden, + edge_index=edge_index, + ) + elif model_type == "VarOGN": + model = VarOGN( + input_keys=input_keys, + output_keys=output_keys, + n_f=n_f, + msg_dim=msg_dim, + ndim=ndim, + hidden=hidden, + edge_index=edge_index, + l1_strength=l1_strength, + ) + else: + raise ValueError(f"未知的模型类型: {model_type}") + + return model + + +def replicate_array(arr, n): + return np.tile(arr, (n,) + (1,) * arr.ndim) + + +def create_loss_function(cfg): + def loss_function(output_dict, label_dict, weight_dict=None): + loss_type = getattr(cfg.TRAIN.loss, "type", "MAE") + + output_key = list(cfg.MODEL.output_keys)[0] + + target = label_dict[output_key] + pred = output_dict[output_key] + if loss_type == "MSE": + base_loss = paddle.mean(paddle.square(pred - target)) + else: + base_loss = paddle.mean(paddle.abs(pred - target)) + + if "l1_regularization" in output_dict: + base_loss = base_loss + output_dict["l1_regularization"] + + return {output_key: base_loss} + + return loss_function + + +def train(cfg): + ppsci.utils.misc.set_random_seed(cfg.seed) + + if cfg.MODEL.arch == "HGN": + cfg.MODEL.output_keys = ["acceleration"] + + sim = SimulationDataset( + sim=cfg.DATA.type, + n=cfg.DATA.num_nodes, + dim=cfg.DATA.dimension, + nt=cfg.DATA.time_steps, + dt=cfg.DATA.time_step_size, + ) + sim.simulate(cfg.DATA.num_samples) + accel_data = sim.get_acceleration() + + X_list = [] + y_list = [] + for sample_idx in range(cfg.DATA.num_samples): + for t in range(0, sim.data.shape[1], cfg.DATA.sample_interval): + X_list.append(sim.data[sample_idx, t]) + y_list.append(accel_data[sample_idx, t]) + + X = np.array(X_list, dtype=np.float32) + y = np.array(y_list, dtype=np.float32) + + if cfg.MODEL.n_f == "auto": + cfg.MODEL.n_f = cfg.DATA.dimension * 2 + 2 + if cfg.MODEL.ndim == "auto": + cfg.MODEL.ndim = cfg.DATA.dimension + + train_size = int(len(X) * 0.8) + X_train, X_val = X[:train_size], X[train_size:] + y_train, y_val = y[:train_size], y[train_size:] + edge_index = get_edge_index(cfg.DATA.num_nodes, cfg.DATA.type) + edge_index_train = replicate_array(edge_index, len(X_train)) + edge_index_val = replicate_array(edge_index, len(X_val)) + + train_constraint = ppsci.constraint.SupervisedConstraint( + { + "dataset": { + "name": "NamedArrayDataset", + "input": {"x": X_train, "edge_index": edge_index_train}, + "label": {cfg.MODEL.output_keys[0]: y_train}, + }, + "batch_size": cfg.TRAIN.batch_size, + "sampler": { + "name": "BatchSampler", + "drop_last": False, + 
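# reshuffle the training split every epoch (the eval constraint below keeps order)
+                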
"shuffle": True, + }, + }, + create_loss_function(cfg), + name="sup_constraint", + ) + eval_constraint = ppsci.constraint.SupervisedConstraint( + { + "dataset": { + "name": "NamedArrayDataset", + "input": {"x": X_val, "edge_index": edge_index_val}, + "label": {cfg.MODEL.output_keys[0]: y_val}, + }, + "batch_size": cfg.TRAIN.batch_size, + "sampler": { + "name": "BatchSampler", + "drop_last": False, + "shuffle": False, + }, + }, + create_loss_function(cfg), + name="eval_constraint", + ) + constraint = { + train_constraint.name: train_constraint, + eval_constraint.name: eval_constraint, + } + l1_strength = ( + cfg.MODEL.l1_strength + if cfg.MODEL.regularization_type == "l1" and cfg.MODEL.arch in ["OGN", "VarOGN"] + else 0.0 + ) + model = create_ppsci_model( + model_type=cfg.MODEL.arch, + n_f=cfg.MODEL.n_f, + msg_dim=cfg.MODEL.msg_dim, + ndim=cfg.MODEL.ndim, + hidden=cfg.MODEL.hidden, + edge_index=edge_index, + l1_strength=l1_strength, + ) + + batch_per_epoch = int(train_size / cfg.TRAIN.batch_size) + if cfg.TRAIN.lr_scheduler.name == "OneCycleLR": + lr_scheduler = paddle.optimizer.lr.OneCycleLR( + max_learning_rate=cfg.TRAIN.lr_scheduler.max_learning_rate, + total_steps=int(cfg.TRAIN.epochs * batch_per_epoch), + divide_factor=cfg.TRAIN.lr_scheduler.final_div_factor, + ) + lr_scheduler.by_epoch = False + else: + lr_scheduler = paddle.optimizer.lr.ExponentialDecay( + learning_rate=cfg.TRAIN.optimizer.learning_rate, gamma=0.9 + ) + lr_scheduler.by_epoch = True + + optimizer = paddle.optimizer.Adam( + learning_rate=lr_scheduler, + parameters=model.parameters(), + weight_decay=cfg.TRAIN.optimizer.weight_decay, + ) + + solver = ppsci.solver.Solver( + model, + constraint, + cfg.output_dir, + optimizer, + epochs=cfg.TRAIN.epochs, + save_freq=cfg.TRAIN.save_freq, + ) + + solver.train() + solver.plot_loss_history(by_epoch=True, smooth_step=1) + paddle.save(model.state_dict(), os.path.join(f"{cfg.MODEL.arch}.pdparams")) + + +def evaluate(cfg: DictConfig): + if cfg.MODEL.n_f == "auto": + cfg.MODEL.n_f = cfg.DATA.dimension * 2 + 2 + if cfg.MODEL.ndim == "auto": + cfg.MODEL.ndim = cfg.DATA.dimension + + edge_index = get_edge_index(cfg.DATA.num_nodes, cfg.DATA.type) + l1_strength = ( + cfg.MODEL.l1_strength + if cfg.MODEL.regularization_type == "l1" and cfg.MODEL.arch in ["OGN", "VarOGN"] + else 0.0 + ) + + model = create_ppsci_model( + model_type=cfg.MODEL.arch, + n_f=cfg.MODEL.n_f, + msg_dim=cfg.MODEL.msg_dim, + ndim=cfg.MODEL.ndim, + hidden=cfg.MODEL.hidden, + edge_index=edge_index, + l1_strength=l1_strength, + ) + + ppsci.utils.save_load.load_pretrain( + model, + cfg.EVAL.pretrained_model_path, + ) + + sim = SimulationDataset( + sim=cfg.DATA.type, + n=cfg.DATA.num_nodes, + dim=cfg.DATA.dimension, + nt=cfg.DATA.time_steps, + dt=cfg.DATA.time_step_size, + ) + sim.simulate(cfg.DATA.num_samples) + accel_data = sim.get_acceleration() + + sample_indices = [0, 1] if cfg.DATA.num_samples > 1 else [0] + + for sample_idx in sample_indices: + sample_data = sim.data[sample_idx, 0:1] + true_accel = accel_data[sample_idx, 0:1] + + input_dict = { + "x": paddle.to_tensor(sample_data, dtype="float32"), + "edge_index": paddle.to_tensor(edge_index, dtype="int64"), + } + + with paddle.no_grad(): + pred_output = model(input_dict) + pred_accel = pred_output[cfg.MODEL.output_keys[0]] + + error = np.mean(np.abs(pred_accel.numpy() - true_accel)) + logger.info(f"Sample {sample_idx} - MAE error: {error:.6f}") + + rel_error = np.linalg.norm(pred_accel.numpy() - true_accel) / np.linalg.norm( + true_accel + ) + logger.info(f"Sample 
{sample_idx} - Relative error: {rel_error:.6f}") + + plt.figure(figsize=(10, 8)) + sim.plot(sample_idx, animate=False, plot_size=True, s_size=2) + plt.title( + f"{cfg.DATA.type.capitalize()} System - Sample {sample_idx}\nMAE: {error:.4f}, Rel Error: {rel_error:.4f}" + ) + plt.tight_layout() + + plot_path = os.path.join(cfg.output_dir, f"evaluation_sample_{sample_idx}.png") + plt.savefig(plot_path, dpi=300, bbox_inches="tight") + plt.close() + logger.info(f"Evaluation plot saved to {plot_path}") + + +def export(cfg: DictConfig): + if cfg.MODEL.arch == "HGN": + raise ValueError("HGN is not supported for export") + + ppsci.utils.misc.set_random_seed(cfg.seed) + + if cfg.MODEL.n_f == "auto": + cfg.MODEL.n_f = cfg.DATA.dimension * 2 + 2 + if cfg.MODEL.ndim == "auto": + cfg.MODEL.ndim = cfg.DATA.dimension + edge_index = get_edge_index(cfg.DATA.num_nodes, cfg.DATA.type) + l1_strength = ( + cfg.MODEL.l1_strength + if cfg.MODEL.regularization_type == "l1" and cfg.MODEL.arch in ["OGN", "VarOGN"] + else 0.0 + ) + model = create_ppsci_model( + model_type=cfg.MODEL.arch, + n_f=cfg.MODEL.n_f, + msg_dim=cfg.MODEL.msg_dim, + ndim=cfg.MODEL.ndim, + hidden=cfg.MODEL.hidden, + edge_index=edge_index, + l1_strength=l1_strength, + ) + + solver = ppsci.solver.Solver( + model, + pretrained_model_path=cfg.INFER.pretrained_model_path, + ) + + from paddle.static import InputSpec + + input_spec = [ + { + key: ( + InputSpec( + [None, cfg.DATA.num_nodes, cfg.MODEL.n_f], "float32", name=key + ) + if key == "x" + else InputSpec([2, 30], "int64", name=key) + ) + for key in model.input_keys + }, + ] + solver.export(input_spec, cfg.INFER.export_path) + + +def inference(cfg: DictConfig): + from deploy import python_infer + + try: + predictor = python_infer.GeneralPredictor(cfg) + use_predictor = True + except Exception as e: + logger.error(f"GeneralPredictor failed: {e}") + logger.info("Switching to direct model inference...") + use_predictor = False + + sim = SimulationDataset( + sim=cfg.DATA.type, + n=cfg.DATA.num_nodes, + dim=cfg.DATA.dimension, + nt=cfg.DATA.time_steps, + dt=cfg.DATA.time_step_size, + ) + sim.simulate(cfg.DATA.num_samples) + + sample_idx = 0 + sample_data = sim.data[sample_idx, 0] + edge_index = get_edge_index(cfg.DATA.num_nodes, cfg.DATA.type) + + if use_predictor: + input_dict = {"x": sample_data, "edge_index": edge_index} + output_dict = predictor.predict(input_dict, cfg.INFER.batch_size) + pred_acceleration = output_dict[cfg.MODEL.output_keys[0]] + else: + if cfg.MODEL.n_f == "auto": + cfg.MODEL.n_f = cfg.DATA.dimension * 2 + 2 + if cfg.MODEL.ndim == "auto": + cfg.MODEL.ndim = cfg.DATA.dimension + + l1_strength = ( + cfg.MODEL.l1_strength + if cfg.MODEL.regularization_type == "l1" + and cfg.MODEL.arch in ["OGN", "VarOGN"] + else 0.0 + ) + + model = create_ppsci_model( + model_type=cfg.MODEL.arch, + n_f=cfg.MODEL.n_f, + msg_dim=cfg.MODEL.msg_dim, + ndim=cfg.MODEL.ndim, + hidden=cfg.MODEL.hidden, + edge_index=edge_index, + l1_strength=l1_strength, + ) + + ppsci.utils.save_load.load_pretrain( + model, + cfg.INFER.pretrained_model_path, + ) + + input_dict = { + "x": paddle.to_tensor(sample_data, dtype="float32"), + "edge_index": paddle.to_tensor(edge_index, dtype="int64"), + } + + with paddle.no_grad(): + pred_output = model(input_dict) + pred_acceleration = pred_output[cfg.MODEL.output_keys[0]].numpy() + + accel_data = sim.get_acceleration() + true_acceleration = accel_data[sample_idx, 0] + + error = np.mean(np.abs(pred_acceleration - true_acceleration)) + logger.info(f"Inference error (MAE): 
{error:.6f}") + + rel_error = np.linalg.norm(pred_acceleration - true_acceleration) / np.linalg.norm( + true_acceleration + ) + logger.info(f"Inference relative error: {rel_error:.6f}") + + plt.figure(figsize=(10, 8)) + sim.plot(sample_idx, animate=False, plot_size=True, s_size=2) + plt.title( + f"{cfg.DATA.type.capitalize()} System - Inference\nMAE: {error:.4f}, Rel Error: {rel_error:.4f}" + ) + plt.tight_layout() + + plot_path = os.path.join(cfg.output_dir, "inference.png") + plt.savefig(plot_path, dpi=300, bbox_inches="tight") + plt.close() + logger.info(f"Inference plot saved to {plot_path}") + + +@hydra.main(version_base=None, config_path="./conf", config_name="config_hgn") +def main(cfg: DictConfig) -> None: + if cfg.mode == "train": + train(cfg) + elif cfg.mode == "eval": + evaluate(cfg) + elif cfg.mode == "export": + export(cfg) + elif cfg.mode == "infer": + inference(cfg) + else: + raise ValueError( + "cfg.mode should in [train, eval, export, infer], but got {}".format( + cfg.mode + ) + ) + + +if __name__ == "__main__": + main() diff --git a/examples/symbolic_gn/simulate.py b/examples/symbolic_gn/simulate.py new file mode 100644 index 000000000..0a9c2d99a --- /dev/null +++ b/examples/symbolic_gn/simulate.py @@ -0,0 +1,705 @@ +# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved. + +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at + +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import matplotlib as mpl +import numpy as np +import paddle +from celluloid import Camera +from matplotlib import pyplot as plt +from scipy.integrate import odeint +from tqdm import tqdm + + +def make_transparent_color(ntimes, fraction): + """ + Create transparent colors for trajectory visualization. + + This function generates a gradient from white to a specified color, + with transparency increasing from 0 to 1. Used for drawing particle + trajectories where old positions are transparent and new positions are opaque. + + Args: + ntimes (int): Number of time steps, i.e., number of colors to generate. + fraction (float): Color fraction in range [0,1] for selecting color from spectrum. + + Returns: + numpy.ndarray: RGBA color array with shape (ntimes, 4). + Each row represents the color [R, G, B, Alpha] at a time step. + """ + rgba = np.ones((ntimes, 4)) + + alpha = np.linspace(0, 1, ntimes)[:, np.newaxis] + + color = np.array(mpl.colors.to_rgba(mpl.cm.gist_ncar(fraction)))[np.newaxis, :] + + rgba[:, :] = 1 * (1 - alpha) + color * alpha + + rgba[:, 3] = alpha[:, 0] + + return rgba + + +def get_potential(sim, sim_obj): + """ + Get potential energy function based on simulation type. + + This is a factory function that returns the corresponding potential energy + calculation function based on different physical scenarios. + Supported scenarios include: gravity, springs, charges, damped systems, etc. 
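+    The returned closure operates on plain NumPy arrays; the PaddlePaddle
+    autodiff counterparts are re-implemented inline within SimulationDataset.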
+ + Args: + sim (str): Simulation type, options: + 'r2' - Inverse square law (e.g., universal gravitation) + 'r1' - Logarithmic potential + 'spring' - Spring potential + 'damped' - Damped spring + 'string' - String potential + 'string_ball' - String with ball obstacle + 'charge' - Charge interaction + 'superposition' - Superposition of gravity and charge + 'discontinuous' - Piecewise discontinuous potential + sim_obj (SimulationDataset): Simulation dataset object for obtaining parameters like dimension. + + Returns: + function: Potential energy function that takes two particle states and returns their potential energy. + """ + dim = sim_obj._dim + + def potential(x1, x2): + """ + Calculate potential energy between two particles. + + Particle state format: + 2D: [x, y, vx, vy, charge, mass] + 3D: [x, y, z, vx, vy, vz, charge, mass] + + Args: + x1 (numpy.ndarray): State vector of the first particle. + x2 (numpy.ndarray): State vector of the second particle. + + Returns: + float: Potential energy value between the two particles. + """ + dist = np.sqrt(np.sum(np.square(x1[:dim] - x2[:dim]))) + + min_dist = 1e-2 + bounded_dist = dist + min_dist + + if sim == "r2": + return -x1[-1] * x2[-1] / bounded_dist + + elif sim == "r1": + return x1[-1] * x2[-1] * np.log(bounded_dist) + + elif sim in ["spring", "damped"]: + potential_val = (bounded_dist - 1) ** 2 + + if sim == "damped": + damping = 1 + potential_val += damping * x1[1] * x1[1 + sim_obj._dim] / sim_obj._n + potential_val += damping * x1[0] * x1[0 + sim_obj._dim] / sim_obj._n + if sim_obj._dim == 3: + potential_val += damping * x1[2] * x1[2 + sim_obj._dim] / sim_obj._n + + return potential_val + + elif sim == "string": + + return (bounded_dist - 1) ** 2 + x1[1] * x1[-1] + + elif sim == "string_ball": + potential_val = (bounded_dist - 1) ** 2 + x1[1] * x1[-1] + + r = np.sqrt((x1[1] + 15) ** 2 + (x1[0] - 5) ** 2) + radius = 4.0 + k_repel = 100.0 + potential_val += k_repel / np.maximum(r - radius + 0.5, 0.01) + return potential_val + + elif sim in ["charge", "superposition"]: + + charge1 = x1[-2] + charge2 = x2[-2] + potential_val = charge1 * charge2 / bounded_dist + + if sim in ["superposition"]: + + m1 = x1[-1] + m2 = x2[-1] + potential_val += -m1 * m2 / bounded_dist + + return potential_val + + elif sim in ["discontinuous"]: + + m1 = x1[-1] + m2 = x2[-1] + + pot_a = 0.0 + pot_b = 0.0 + pot_c = (bounded_dist - 1) ** 2 + potential_val = pot_a * (bounded_dist < 1) + (bounded_dist >= 1) * ( + pot_b * (bounded_dist < 2) + pot_c * (bounded_dist >= 2) + ) + return potential_val + + else: + raise NotImplementedError("No such simulation " + str(sim)) + + return potential + + +class SimulationDataset(object): + """ + Physical system simulation dataset class (PaddlePaddle version). + + This class is used to generate and manage simulation data for many-body physical systems. + It simulates the time evolution of particle systems by numerically solving Newton's + equations of motion. + + Main features: + - Generate simulation data for multiple samples + - Compute system acceleration (using PaddlePaddle automatic differentiation) + - Visualize particle trajectories + - Support custom potential energy functions + + Attributes: + _sim (str): Simulation type. + _n (int): Number of particles. + _dim (int): Spatial dimension. + dt (float): Time step size. + nt (int): Number of time steps. + data (ndarray): Simulation data with shape (num_samples, num_timesteps, num_particles, state_dim). + times (ndarray): Time series. + G (float): Gravitational constant. 
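extra_potential (function): Optional extra per-particle potential term.
+        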
+ pairwise (function): Pairwise potential energy function. + """ + + def __init__( + self, sim="r2", n=5, dim=2, dt=0.01, nt=100, extra_potential=None, **kwargs + ): + """ + Initialize simulation dataset. + + Args: + sim (str, optional): Simulation type. Defaults to 'r2' (inverse square law). + n (int, optional): Number of particles. Defaults to 5. + dim (int, optional): Spatial dimension (2 or 3). Defaults to 2. + dt (float, optional): Time step size. Defaults to 0.01. + nt (int, optional): Number of time steps to return. Defaults to 100. + extra_potential (function, optional): Additional potential energy function. Defaults to None. + **kwargs: Other parameters reserved for extension. + """ + self._sim = sim + self._n = n + self._dim = dim + self._kwargs = kwargs + self.dt = dt + self.nt = nt + self.data = None + + self.times = np.linspace(0, self.dt * self.nt, num=self.nt) + + self.G = 1 + self.extra_potential = extra_potential + + self.pairwise = get_potential(sim=sim, sim_obj=self) + + def simulate(self, ns, seed=0): + """ + Generate simulation data. + + This method is the core function of the simulator. For each sample: + 1. Randomly initialize particle positions, velocities, masses, and charges + 2. Solve Newton's equations of motion using Scipy ODE solver + 3. Record the entire time evolution process + + Physical equations: + d²x/dt² = F/m = -∇U/m + where U is potential energy and F is force + + Args: + ns (int): Number of samples to generate. + seed (int, optional): Seed for numpy random number generator. Defaults to 0. + + Returns: + None: Results are stored in self.data. + """ + + np.random.seed(seed) + + n = self._n + dim = self._dim + sim = self._sim + params = 2 + total_dim = dim * 2 + params + times = self.times + G = self.G + + def compute_pairwise_potential(x1, x2_list): + """Vectorized calculation of potential energy between one particle and multiple particles""" + potentials = [] + for x2 in x2_list: + potentials.append(self.pairwise(x1, x2)) + return np.array(potentials) + + def total_potential(xt): + """ + Calculate total potential energy of the system (using Numpy). + + Args: + xt (numpy.ndarray): System state with shape (n, total_dim). + + Returns: + float: Total potential energy of the system. + """ + sum_potential = 0.0 + + for i in range(n - 1): + if sim in ["string", "string_ball"]: + + sum_potential += G * self.pairwise(xt[i], xt[i + 1]) + else: + + for j in range(i + 1, n): + sum_potential += G * self.pairwise(xt[i], xt[j]) + + if self.extra_potential is not None: + for i in range(n): + sum_potential += self.extra_potential(xt[i]) + + return sum_potential + + def force_paddle(xt): + """ + Calculate force using PaddlePaddle automatic differentiation. + + This is a key part of the migration: + JAX: force = -grad(total_potential)(xt) + Paddle: Use paddle.grad to calculate gradient + + Complete implementation for all potential scenarios. + + Args: + xt (numpy.ndarray): System state with shape (n, total_dim). + + Returns: + numpy.ndarray: Force on each particle with shape (n, dim). 
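Note: this equals F = -dU/dx, obtained by differentiating the total
+            potential with paddle.grad rather than hand-coding the forces.
+            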
+ """ + + xt_tensor = paddle.to_tensor(xt, dtype="float64", stop_gradient=False) + + positions = xt_tensor[:, :dim] + positions.stop_gradient = False + + xt_for_potential = paddle.concat([positions, xt_tensor[:, dim:]], axis=1) + + sum_potential = paddle.to_tensor(0.0, dtype="float64") + + for i in range(n - 1): + if sim in ["string", "string_ball"]: + + x1 = xt_for_potential[i] + x2 = xt_for_potential[i + 1] + dist = paddle.sqrt(paddle.sum(paddle.square(x1[:dim] - x2[:dim]))) + bounded_dist = dist + 1e-2 + + if sim == "string": + + pot = (bounded_dist - 1) ** 2 + x1[1] * x1[-1] + + elif sim == "string_ball": + + pot = (bounded_dist - 1) ** 2 + x1[1] * x1[-1] + + r = paddle.sqrt((x1[1] + 15) ** 2 + (x1[0] - 5) ** 2) + radius = paddle.to_tensor(4.0, dtype="float64") + + k_repel = 100.0 + pot = pot + k_repel / paddle.maximum( + r - radius + 0.5, paddle.to_tensor(0.01, dtype="float64") + ) + + sum_potential = sum_potential + G * pot + + else: + + for j in range(i + 1, n): + x1 = xt_for_potential[i] + x2 = xt_for_potential[j] + dist = paddle.sqrt( + paddle.sum(paddle.square(x1[:dim] - x2[:dim])) + ) + bounded_dist = dist + 1e-2 + + if sim == "r2": + + pot = -x1[-1] * x2[-1] / bounded_dist + + elif sim == "r1": + + pot = x1[-1] * x2[-1] * paddle.log(bounded_dist) + + elif sim in ["spring", "damped"]: + + pot = (bounded_dist - 1) ** 2 + + if sim == "damped": + damping = paddle.to_tensor(1.0, dtype="float64") + + pot = pot + damping * x1[1] * x1[1 + dim] / n + pot = pot + damping * x1[0] * x1[0 + dim] / n + if dim == 3: + pot = pot + damping * x1[2] * x1[2 + dim] / n + + elif sim in ["charge", "superposition"]: + + charge1 = x1[-2] + charge2 = x2[-2] + pot = charge1 * charge2 / bounded_dist + + if sim == "superposition": + + m1 = x1[-1] + m2 = x2[-1] + pot = pot + (-m1 * m2 / bounded_dist) + + elif sim == "discontinuous": + + pot_a = paddle.to_tensor(0.0, dtype="float64") + pot_b = paddle.to_tensor(0.0, dtype="float64") + pot_c = (bounded_dist - 1) ** 2 + + cond1 = paddle.cast(bounded_dist < 1, "float64") + cond2 = paddle.cast( + bounded_dist >= 1, "float64" + ) * paddle.cast(bounded_dist < 2, "float64") + cond3 = paddle.cast(bounded_dist >= 2, "float64") + + pot = pot_a * cond1 + pot_b * cond2 + pot_c * cond3 + + else: + + pot = (bounded_dist - 1) ** 2 + + sum_potential = sum_potential + G * pot + + if self.extra_potential is not None: + + for i in range(n): + + extra_pot = self.extra_potential(xt_for_potential[i]) + if not isinstance(extra_pot, paddle.Tensor): + extra_pot = paddle.to_tensor(extra_pot, dtype="float64") + sum_potential = sum_potential + extra_pot + + grads = paddle.grad( + outputs=sum_potential, + inputs=positions, + create_graph=False, + retain_graph=False, + )[0] + + force = -grads.numpy() + + return force + + def acceleration(xt): + """ + Calculate acceleration: a = F/m. + + Args: + xt (numpy.ndarray): System state. + + Returns: + numpy.ndarray: Acceleration. + """ + force = force_paddle(xt) + masses = xt[:, -1:] + return force / masses + + def odefunc(y, t): + """ + Ordinary differential equation function (for scipy.odeint). + + Args: + y (numpy.ndarray): Flattened system state. + t (float): Time. + + Returns: + numpy.ndarray: Time derivative of the state. + """ + + y = y.reshape((n, total_dim)) + + a = acceleration(y) + + dydt = np.concatenate( + [y[:, dim : 2 * dim], a, np.zeros((n, params))], axis=1 + ) + + return dydt.flatten() + + def make_sim(sample_idx): + """ + Generate single simulation sample. + + Args: + sample_idx (int): Sample index (for progress display). 
+ + Returns: + numpy.ndarray: Simulation trajectory with shape (nt, n, total_dim). + """ + if sim in ["string", "string_ball"]: + + x0 = np.random.randn(n, total_dim) + x0[:, -1] = 1.0 + x0[:, 0] = np.arange(n) + x0[:, 0] * 0.5 + x0[:, 2:3] = 0.0 + else: + + x0 = np.random.randn(n, total_dim) + x0[:, -1] = np.exp(x0[:, -1]) + + if sim in ["charge", "superposition"]: + x0[:, -2] = np.sign(x0[:, -2]) + + x_times = odeint(odefunc, x0.flatten(), times, mxstep=2000).reshape( + -1, n, total_dim + ) + + return x_times + + data = [] + print(f"Start generating {ns} samples...") + for i in tqdm(range(ns)): + data.append(make_sim(i)) + + self.data = np.array(data) + print(f"Generation complete! Data shape: {self.data.shape}") + + def get_acceleration(self): + """ + Calculate acceleration for all simulation data. + + This method uses PaddlePaddle automatic differentiation to calculate acceleration. + Uses the same complete potential implementation as in simulate(). + + Returns: + numpy.ndarray: Acceleration data with shape (ns, nt, n, dim). + """ + if self.data is None: + raise ValueError("Please call simulate() first to generate data") + + ns, nt, n, total_dim = self.data.shape + dim = self._dim + sim = self._sim + G = self.G + + accelerations = np.zeros((ns, nt, n, dim)) + + print("Calculating acceleration...") + for sample_idx in tqdm(range(ns)): + for time_idx in range(nt): + xt = self.data[sample_idx, time_idx] + + xt_tensor = paddle.to_tensor(xt, dtype="float64", stop_gradient=False) + positions = xt_tensor[:, :dim] + positions.stop_gradient = False + + xt_for_potential = paddle.concat( + [positions, xt_tensor[:, dim:]], axis=1 + ) + + sum_potential = paddle.to_tensor(0.0, dtype="float64") + + for i in range(n - 1): + if sim in ["string", "string_ball"]: + + x1 = xt_for_potential[i] + x2 = xt_for_potential[i + 1] + dist = paddle.sqrt( + paddle.sum(paddle.square(x1[:dim] - x2[:dim])) + ) + bounded_dist = dist + 1e-2 + + if sim == "string": + pot = (bounded_dist - 1) ** 2 + x1[1] * x1[-1] + elif sim == "string_ball": + + pot = (bounded_dist - 1) ** 2 + x1[1] * x1[-1] + r = paddle.sqrt((x1[1] + 15) ** 2 + (x1[0] - 5) ** 2) + radius = paddle.to_tensor(4.0, dtype="float64") + + k_repel = 100.0 + pot = pot + k_repel / paddle.maximum( + r - radius + 0.5, + paddle.to_tensor(0.01, dtype="float64"), + ) + + sum_potential = sum_potential + G * pot + else: + + for j in range(i + 1, n): + x1 = xt_for_potential[i] + x2 = xt_for_potential[j] + dist = paddle.sqrt( + paddle.sum(paddle.square(x1[:dim] - x2[:dim])) + ) + bounded_dist = dist + 1e-2 + + if sim == "r2": + pot = -x1[-1] * x2[-1] / bounded_dist + elif sim == "r1": + pot = x1[-1] * x2[-1] * paddle.log(bounded_dist) + elif sim in ["spring", "damped"]: + pot = (bounded_dist - 1) ** 2 + + if sim == "damped": + damping = paddle.to_tensor(1.0, dtype="float64") + pot = pot + damping * x1[1] * x1[1 + dim] / n + pot = pot + damping * x1[0] * x1[0 + dim] / n + if dim == 3: + pot = pot + damping * x1[2] * x1[2 + dim] / n + + elif sim in ["charge", "superposition"]: + charge1 = x1[-2] + charge2 = x2[-2] + pot = charge1 * charge2 / bounded_dist + if sim == "superposition": + m1 = x1[-1] + m2 = x2[-1] + pot = pot + (-m1 * m2 / bounded_dist) + + elif sim == "discontinuous": + + pot_a = paddle.to_tensor(0.0, dtype="float64") + pot_b = paddle.to_tensor(0.0, dtype="float64") + pot_c = (bounded_dist - 1) ** 2 + + cond1 = paddle.cast(bounded_dist < 1, "float64") + cond2 = paddle.cast( + bounded_dist >= 1, "float64" + ) * paddle.cast(bounded_dist < 2, "float64") + cond3 = 
paddle.cast(bounded_dist >= 2, "float64") + + pot = pot_a * cond1 + pot_b * cond2 + pot_c * cond3 + + else: + pot = (bounded_dist - 1) ** 2 + + sum_potential = sum_potential + G * pot + + if self.extra_potential is not None: + for i in range(n): + extra_pot = self.extra_potential(xt_for_potential[i]) + if not isinstance(extra_pot, paddle.Tensor): + extra_pot = paddle.to_tensor(extra_pot, dtype="float64") + sum_potential = sum_potential + extra_pot + + grads = paddle.grad( + outputs=sum_potential, + inputs=positions, + create_graph=False, + retain_graph=False, + )[0] + + force = -grads.numpy() + masses = xt[:, -1:] + accel = force / masses + + accelerations[sample_idx, time_idx] = accel + + return accelerations + + def plot(self, i, animate=False, plot_size=True, s_size=1): + """ + Visualize simulation results. + + Args: + i (int): Sample index. + animate (bool, optional): Whether to generate animation. Defaults to False. + plot_size (bool, optional): Whether to adjust point size based on mass. Defaults to True. + s_size (float, optional): Point size scaling factor. Defaults to 1. + + Returns: + None (static plot) or HTML (animation). + """ + if self.data is None: + raise ValueError("Please call simulate() first to generate data") + + n = self._n + times = self.times + x_times = self.data[i] + sim = self._sim + masses = x_times[:, :, -1] + + if not animate: + + if sim in ["string", "string_ball"]: + rgba = make_transparent_color(len(times), 0) + for idx in range(0, len(times), len(times) // 10): + ctimes = x_times[idx] + plt.plot(ctimes[:, 0], ctimes[:, 1], color=rgba[idx]) + plt.xlim(-5, 20) + plt.ylim(-20, 5) + else: + for j in range(n): + rgba = make_transparent_color(len(times), j / n) + if plot_size: + plt.scatter( + x_times[:, j, 0], + x_times[:, j, 1], + color=rgba, + s=3 * masses[:, j] * s_size, + ) + else: + plt.scatter( + x_times[:, j, 0], x_times[:, j, 1], color=rgba, s=s_size + ) + else: + + if sim in ["string", "string_ball"]: + raise NotImplementedError( + "Animation mode not yet supported for string type" + ) + + fig = plt.figure() + camera = Camera(fig) + d_idx = 20 + + for t_idx in range(d_idx, len(times), d_idx): + start = max([0, t_idx - 300]) + ctimes = times[start:t_idx] + cx_times = x_times[start:t_idx] + + for j in range(n): + rgba = make_transparent_color(len(ctimes), j / n) + if plot_size: + plt.scatter( + cx_times[:, j, 0], + cx_times[:, j, 1], + color=rgba, + s=3 * masses[:, j], + ) + else: + plt.scatter( + cx_times[:, j, 0], cx_times[:, j, 1], color=rgba, s=s_size + ) + + camera.snap() + + from IPython.display import HTML + + return HTML(camera.animate().to_jshtml()) diff --git a/ppsci/arch/__init__.py b/ppsci/arch/__init__.py index 498a380cc..67c5b026a 100644 --- a/ppsci/arch/__init__.py +++ b/ppsci/arch/__init__.py @@ -54,6 +54,7 @@ from ppsci.arch.physx_transformer import PhysformerGPT2 # isort:skip from ppsci.arch.sfnonet import SFNONet # isort:skip from ppsci.arch.spinn import SPINN # isort:skip +from ppsci.arch.symbolic_gn import OGN, VarOGN, HGN # isort:skip from ppsci.arch.tfnonet import TFNO1dNet, TFNO2dNet, TFNO3dNet # isort:skip from ppsci.arch.transformer import Transformer # isort:skip from ppsci.arch.unetex import UNetEx # isort:skip @@ -106,6 +107,7 @@ "Generator", "GraphCastNet", "HEDeepONets", + "HGN", "LorenzEmbedding", "LatentNO", "LatentNO_time", @@ -114,6 +116,7 @@ "ModelList", "ModifiedMLP", "NowcastNet", + "OGN", "PhyCRNet", "PhysformerGPT2", "PirateNet", @@ -129,6 +132,7 @@ "UNetEx", "UNONet", "USCNN", + "VarOGN", "VelocityDiscriminator", 
"VelocityGenerator", "RegDGCNN", diff --git a/ppsci/arch/symbolic_gn.py b/ppsci/arch/symbolic_gn.py new file mode 100644 index 000000000..c8b230132 --- /dev/null +++ b/ppsci/arch/symbolic_gn.py @@ -0,0 +1,779 @@ +# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved. + +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at + +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Dict +from typing import List +from typing import Optional +from typing import Tuple + +import numpy as np +import paddle +import paddle.nn as nn + +from ppsci.arch import base + + +def get_edge_index(n: int, sim: str) -> np.ndarray: + """ + Generate edge indices for graph connectivity. + + Args: + n (int): Number of nodes. + sim (str): Simulation type. + + Returns: + numpy.ndarray: Edge indices with shape [2, num_edges]. + """ + if sim in ["string", "string_ball"]: + top = np.arange(0, n - 1) + bottom = np.arange(1, n) + edge_index = np.concatenate( + [ + np.concatenate([top, bottom])[None, :], + np.concatenate([bottom, top])[None, :], + ], + axis=0, + ) + else: + adj = (np.ones((n, n)) - np.eye(n)).astype(int) + edge_index = np.array(np.where(adj)) + + return edge_index + + +class OGN(base.Arch): + def __init__( + self, + input_keys: List[str], + output_keys: List[str], + n_f: int = 6, + msg_dim: int = 100, + ndim: int = 2, + hidden: int = 300, + edge_index: Optional[np.ndarray] = None, + aggr: str = "sum", + l1_strength: float = 0.0, + ): + """ + Initialize Object-based Graph Network (OGN). + + Args: + input_keys (List[str]): List of input keys, e.g., ['x', 'edge_index']. + output_keys (List[str]): List of output keys, e.g., ['acceleration']. + n_f (int, optional): Node feature dimension. Defaults to 6. + msg_dim (int, optional): Message dimension. Defaults to 100. + ndim (int, optional): Spatial dimension. Defaults to 2. + hidden (int, optional): Hidden layer size. Defaults to 300. + edge_index (np.ndarray, optional): Edge indices (can also be provided in forward). Defaults to None. + aggr (str, optional): Aggregation method, 'sum' or 'mean'. Defaults to 'sum'. + + Examples: + >>> import ppsci + >>> import numpy as np + >>> model = ppsci.arch.OGN( + ... input_keys=["x"], + ... output_keys=["acceleration"], + ... n_f=6, + ... ndim=2 + ... ) + >>> n_nodes = 5 + >>> edge_index = np.array([[0, 1, 1, 2, 2, 3, 3, 4], [1, 0, 2, 1, 3, 2, 4, 3]]) + >>> input_dict = { + ... "x": paddle.randn([n_nodes, 6]), + ... "edge_index": edge_index + ... 
} + >>> output = model(input_dict) + >>> print(output["acceleration"].shape) + [5, 2] + """ + super().__init__() + + self.input_keys = input_keys + self.output_keys = output_keys + + self.n_f = n_f + self.msg_dim = msg_dim + self.ndim = ndim + self.hidden = hidden + self.aggr = aggr + self.l1_strength = l1_strength + + if edge_index is not None: + self.register_buffer( + "edge_index_buffer", paddle.to_tensor(edge_index, dtype="int64") + ) + else: + self.edge_index_buffer = None + + self.msg_fnc = nn.Sequential( + nn.Linear(2 * n_f, hidden), + nn.ReLU(), + nn.Linear(hidden, hidden), + nn.ReLU(), + nn.Linear(hidden, hidden), + nn.ReLU(), + nn.Linear(hidden, msg_dim), + ) + + self.node_fnc = nn.Sequential( + nn.Linear(msg_dim + n_f, hidden), + nn.ReLU(), + nn.Linear(hidden, hidden), + nn.ReLU(), + nn.Linear(hidden, hidden), + nn.ReLU(), + nn.Linear(hidden, ndim), + ) + + def message_passing( + self, x: paddle.Tensor, edge_index: np.ndarray + ) -> paddle.Tensor: + """ + Execute message passing. + + Args: + x: Node features with shape [n, n_f] or [batch*n, n_f]. + edge_index: Edge indices with shape [2, num_edges]. + + Returns: + paddle.Tensor: Updated node features with shape [n, ndim] or [batch*n, ndim]. + """ + if len(x.shape) == 3: + batch_size, n, n_f = x.shape + x_reshaped = x.reshape([-1, n_f]) + + results = [] + for b in range(batch_size): + start_idx = b * n + end_idx = (b + 1) * n + x_batch = x_reshaped[start_idx:end_idx] + + row, col = edge_index[0], edge_index[1] + + x_i = x_batch[col] + x_j = x_batch[row] + + msg_input = paddle.concat([x_i, x_j], axis=1) + msg = self.msg_fnc(msg_input) + + aggr_out = paddle.zeros([n, self.msg_dim], dtype=msg.dtype) + for i in range(len(col)): + aggr_out[col[i]] += msg[i] + + node_input = paddle.concat([x_batch, aggr_out], axis=1) + out = self.node_fnc(node_input) + results.append(out) + + return paddle.stack(results, axis=0) + + else: + row, col = edge_index[0], edge_index[1] + + x_i = x[col] + x_j = x[row] + + msg_input = paddle.concat([x_i, x_j], axis=1) + msg = self.msg_fnc(msg_input) + + num_nodes = x.shape[0] + aggr_out = paddle.zeros([num_nodes, self.msg_dim], dtype=msg.dtype) + + for i in range(len(col)): + aggr_out[col[i]] += msg[i] + + node_input = paddle.concat([x, aggr_out], axis=1) + out = self.node_fnc(node_input) + + return out + + def forward(self, inputs: Dict[str, paddle.Tensor]) -> Dict[str, paddle.Tensor]: + """ + Forward propagation (PaddleScience standard interface). + + Args: + inputs (Dict): Input dictionary containing: + - 'x': Node features [batch*n, n_f] or [n, n_f]. + - 'edge_index': Edge index information (optional if provided at initialization). + - Other possible inputs. + + Returns: + Dict: Output dictionary containing: + - 'acceleration': Predicted acceleration [batch*n, ndim]. + - 'l1_regularization': L1 regularization term (if enabled). 
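Note: the L1 term here penalizes the raw node features (inputs), matching
+        the constraint description in the docs.
+        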
+ """ + x = inputs["x"] + + if "edge_index" in inputs: + edge_index = inputs["edge_index"] + if isinstance(edge_index, paddle.Tensor): + edge_index = edge_index.numpy() + if len(edge_index.shape) == 3: + edge_index = edge_index[0] + elif self.edge_index_buffer is not None: + edge_index = self.edge_index_buffer.numpy() + else: + raise ValueError("Must provide edge_index") + + acceleration = self.message_passing(x, edge_index) + + outputs = {self.output_keys[0]: acceleration} + + if hasattr(self, "l1_strength") and self.l1_strength > 0: + l1_reg = self.l1_strength * paddle.mean(paddle.abs(x)) + outputs["l1_regularization"] = l1_reg + + return outputs + + def compute_loss( + self, inputs: Dict[str, paddle.Tensor], labels: Dict[str, paddle.Tensor] + ) -> paddle.Tensor: + """ + Compute loss function (PaddleScience standard interface). + + Args: + inputs: Input dictionary. + labels: Label dictionary containing 'acceleration_true'. + + Returns: + paddle.Tensor: Loss scalar. + """ + outputs = self.forward(inputs) + + pred = outputs[self.output_keys[0]] + true = labels.get("acceleration_true", labels.get(self.output_keys[0])) + + loss = paddle.mean(paddle.abs(pred - true)) + + return loss + + +class HGN(base.Arch): + def __init__( + self, + input_keys: List[str], + output_keys: List[str], + n_f: int = 6, + ndim: int = 2, + hidden: int = 300, + edge_index: Optional[np.ndarray] = None, + ): + """ + Initialize Hamiltonian Graph Network (HGN). + + Args: + input_keys: List of input keys. + output_keys: List of output keys, e.g., ['velocity_derivative', 'acceleration']. + n_f: Node feature dimension. + ndim: Spatial dimension. + hidden: Hidden layer size. + edge_index: Edge indices (optional). + + Examples: + >>> import ppsci + >>> import numpy as np + >>> model = ppsci.arch.HGN( + ... input_keys=["x"], + ... output_keys=["acceleration"], + ... n_f=6, + ... ndim=2 + ... ) + >>> n_nodes = 5 + >>> edge_index = np.array([[0, 1, 1, 2, 2, 3, 3, 4], [1, 0, 2, 1, 3, 2, 4, 3]]) + >>> input_dict = { + ... "x": paddle.randn([n_nodes, 6]), + ... "edge_index": edge_index + ... } + >>> output = model(input_dict) + >>> print(output["acceleration"].shape) + [5, 2] + """ + super().__init__() + + self.input_keys = input_keys + self.output_keys = output_keys + self.n_f = n_f + self.ndim = ndim + self.hidden = hidden + + if edge_index is not None: + self.register_buffer( + "edge_index_buffer", paddle.to_tensor(edge_index, dtype="int64") + ) + else: + self.edge_index_buffer = None + + self.pair_energy = nn.Sequential( + nn.Linear(2 * n_f, hidden), + nn.Softplus(), + nn.Linear(hidden, hidden), + nn.Softplus(), + nn.Linear(hidden, hidden), + nn.Softplus(), + nn.Linear(hidden, 1), + ) + + self.self_energy = nn.Sequential( + nn.Linear(n_f, hidden), + nn.Softplus(), + nn.Linear(hidden, hidden), + nn.Softplus(), + nn.Linear(hidden, hidden), + nn.Softplus(), + nn.Linear(hidden, 1), + ) + + def compute_energy(self, x: paddle.Tensor, edge_index: np.ndarray) -> paddle.Tensor: + """ + Compute the total energy (Hamiltonian) of the system. + + Args: + x: Node features. + edge_index: Edge indices. + + Returns: + paddle.Tensor: Energy per node. 
+
+
+class HGN(base.Arch):
+    def __init__(
+        self,
+        input_keys: List[str],
+        output_keys: List[str],
+        n_f: int = 6,
+        ndim: int = 2,
+        hidden: int = 300,
+        edge_index: Optional[np.ndarray] = None,
+    ):
+        """
+        Initialize Hamiltonian Graph Network (HGN).
+
+        Args:
+            input_keys: List of input keys.
+            output_keys: List of output keys, e.g., ['velocity_derivative', 'acceleration'].
+            n_f: Node feature dimension.
+            ndim: Spatial dimension.
+            hidden: Hidden layer size.
+            edge_index: Edge indices (optional).
+
+        Examples:
+            >>> import ppsci
+            >>> import paddle
+            >>> import numpy as np
+            >>> model = ppsci.arch.HGN(
+            ...     input_keys=["x"],
+            ...     output_keys=["acceleration"],
+            ...     n_f=6,
+            ...     ndim=2
+            ... )
+            >>> n_nodes = 5
+            >>> edge_index = np.array([[0, 1, 1, 2, 2, 3, 3, 4], [1, 0, 2, 1, 3, 2, 4, 3]])
+            >>> input_dict = {
+            ...     "x": paddle.randn([n_nodes, 6]),
+            ...     "edge_index": edge_index
+            ... }
+            >>> output = model(input_dict)
+            >>> print(output["acceleration"].shape)
+            [5, 2]
+        """
+        super().__init__()
+
+        self.input_keys = input_keys
+        self.output_keys = output_keys
+        self.n_f = n_f
+        self.ndim = ndim
+        self.hidden = hidden
+
+        if edge_index is not None:
+            self.register_buffer(
+                "edge_index_buffer", paddle.to_tensor(edge_index, dtype="int64")
+            )
+        else:
+            self.edge_index_buffer = None
+
+        # Pairwise interaction energy E_ij = phi_e(x_i, x_j). Softplus keeps the
+        # energy surface smooth, so its gradients (the forces) are well behaved.
+        self.pair_energy = nn.Sequential(
+            nn.Linear(2 * n_f, hidden),
+            nn.Softplus(),
+            nn.Linear(hidden, hidden),
+            nn.Softplus(),
+            nn.Linear(hidden, hidden),
+            nn.Softplus(),
+            nn.Linear(hidden, 1),
+        )
+
+        # Per-node self energy E_i = phi_v(x_i).
+        self.self_energy = nn.Sequential(
+            nn.Linear(n_f, hidden),
+            nn.Softplus(),
+            nn.Linear(hidden, hidden),
+            nn.Softplus(),
+            nn.Linear(hidden, hidden),
+            nn.Softplus(),
+            nn.Linear(hidden, 1),
+        )
+
+    def compute_energy(self, x: paddle.Tensor, edge_index: np.ndarray) -> paddle.Tensor:
+        """
+        Compute the total energy (Hamiltonian) of the system.
+
+        Args:
+            x: Node features.
+            edge_index: Edge indices.
+
+        Returns:
+            paddle.Tensor: Energy per node.
+        """
+        row, col = edge_index[0], edge_index[1]
+        col_t = paddle.to_tensor(col, dtype="int64")
+
+        x_i = x[col]
+        x_j = x[row]
+        edge_input = paddle.concat([x_i, x_j], axis=1)
+        pair_energies = self.pair_energy(edge_input)
+
+        num_nodes = x.shape[0]
+        # Sum pairwise energies onto their target nodes (accumulating scatter).
+        aggr_pair = paddle.scatter(
+            paddle.zeros([num_nodes, 1], dtype=pair_energies.dtype),
+            col_t,
+            pair_energies,
+            overwrite=False,
+        )
+
+        self_energies = self.self_energy(x)
+
+        # H = sum_i E_i + sum_ij E_ij, reported per node.
+        return aggr_pair + self_energies
+
+    def forward(self, inputs: Dict[str, paddle.Tensor]) -> Dict[str, paddle.Tensor]:
+        """
+        Forward propagation (PaddleScience standard).
+
+        Compute dynamics derivatives using Hamilton's equations.
+
+        Args:
+            inputs: Input dictionary.
+
+        Returns:
+            Dict: Output dictionary.
+        """
+        x_input = inputs["x"].clone()
+
+        if "edge_index" in inputs:
+            edge_index = inputs["edge_index"]
+            if isinstance(edge_index, paddle.Tensor):
+                edge_index = edge_index.numpy()
+            if len(edge_index.shape) == 3:
+                edge_index = edge_index[0]
+        elif self.edge_index_buffer is not None:
+            edge_index = self.edge_index_buffer.numpy()
+        else:
+            raise ValueError("Must provide edge_index")
+
+        if len(x_input.shape) == 3:
+            batch_size, n, n_f = x_input.shape
+            x_reshaped = x_input.reshape([-1, n_f])
+
+            results = []
+            for b in range(batch_size):
+                x_batch = x_reshaped[b * n : (b + 1) * n]
+
+                q = x_batch[:, : self.ndim]
+                v = x_batch[:, self.ndim : 2 * self.ndim]
+                other = x_batch[:, 2 * self.ndim :]
+
+                # Mass is the last node feature ([x, y, vx, vy, charge, mass]).
+                m_scalar = other[:, -1:]
+                m_vec = paddle.tile(m_scalar, [1, self.ndim])
+
+                # Canonical momentum p = m * v.
+                p = v * m_vec
+
+                x_hamilton = paddle.concat([q, p, other], axis=1)
+                x_hamilton.stop_gradient = False
+
+                total_energy = self.compute_energy(x_hamilton, edge_index)
+                total_energy_scalar = paddle.sum(total_energy)
+
+                # Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq.
+                # create_graph must be True during training so that backprop
+                # can reach the energy networks' parameters through dH.
+                dH = paddle.grad(
+                    outputs=total_energy_scalar,
+                    inputs=x_hamilton,
+                    create_graph=self.training,
+                    retain_graph=self.training,
+                )[0]
+
+                dH_dq = dH[:, : self.ndim]
+                dH_dp = dH[:, self.ndim : 2 * self.ndim]
+
+                dq_dt = dH_dp
+                dp_dt = -dH_dq
+                dv_dt = dp_dt / m_vec
+
+                results.append(paddle.concat([dq_dt, dv_dt], axis=1))
+
+            derivative = paddle.stack(results, axis=0)
+
+            outputs = {}
+            if "velocity_derivative" in self.output_keys:
+                outputs["velocity_derivative"] = derivative[:, :, : self.ndim]
+            if "acceleration" in self.output_keys:
+                outputs["acceleration"] = derivative[:, :, self.ndim :]
+            if self.output_keys[0] not in outputs:
+                # Fall back to the full [dq/dt, dv/dt] derivative for custom keys
+                # without overwriting any key that was already filled above.
+                outputs[self.output_keys[0]] = derivative
+
+            return outputs
+
+        else:
+            q = x_input[:, : self.ndim]
+            v = x_input[:, self.ndim : 2 * self.ndim]
+            other = x_input[:, 2 * self.ndim :]
+
+            # Mass is the last node feature ([x, y, vx, vy, charge, mass]).
+            m_scalar = other[:, -1:]
+            m_vec = paddle.tile(m_scalar, [1, self.ndim])
+
+            # Canonical momentum p = m * v.
+            p = v * m_vec
+
+            x_hamilton = paddle.concat([q, p, other], axis=1)
+            x_hamilton.stop_gradient = False
+
+            total_energy = self.compute_energy(x_hamilton, edge_index)
+            total_energy_scalar = paddle.sum(total_energy)
+
+            # Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq.
+            dH = paddle.grad(
+                outputs=total_energy_scalar,
+                inputs=x_hamilton,
+                create_graph=self.training,
+                retain_graph=self.training,
+            )[0]
+
+            dH_dq = dH[:, : self.ndim]
+            dH_dp = dH[:, self.ndim : 2 * self.ndim]
+
+            dq_dt = dH_dp
+            dp_dt = -dH_dq
+            dv_dt = dp_dt / m_vec
+
+            outputs = {}
+            if "velocity_derivative" in self.output_keys:
+                outputs["velocity_derivative"] = dq_dt
+            if "acceleration" in self.output_keys:
+                outputs["acceleration"] = dv_dt
+            if self.output_keys[0] not in outputs:
+                outputs[self.output_keys[0]] = paddle.concat([dq_dt, dv_dt], axis=1)
+
+            return outputs
+
+    def compute_loss(
+        self,
+        inputs: Dict[str, paddle.Tensor],
+        labels: Dict[str, paddle.Tensor],
+        reg_weight: float = 1e-6,
+    ) -> paddle.Tensor:
+        """
+        Compute loss with physical regularization.
+
+        Note:
+            The energy regularization below assumes un-batched inputs of
+            shape [n, n_f].
+
+        Args:
+            inputs: Input dictionary.
+            labels: Label dictionary.
+            reg_weight: Weight of the energy-magnitude regularization.
+
+        Returns:
+            paddle.Tensor: Total loss.
+        """
+        outputs = self.forward(inputs)
+
+        pred = outputs.get("acceleration", outputs.get(self.output_keys[0]))
+        true = labels.get("acceleration_true", labels.get("acceleration"))
+
+        base_loss = paddle.mean(paddle.abs(pred - true))
+
+        x_input = inputs["x"].clone()
+        edge_index = inputs.get("edge_index")
+        if edge_index is None:
+            if self.edge_index_buffer is None:
+                raise ValueError("Must provide edge_index")
+            edge_index = self.edge_index_buffer.numpy()
+        if isinstance(edge_index, paddle.Tensor):
+            edge_index = edge_index.numpy()
+
+        q = x_input[:, : self.ndim]
+        v = x_input[:, self.ndim : 2 * self.ndim]
+        other = x_input[:, 2 * self.ndim :]
+
+        m_scalar = other[:, -1:]
+        m_vec = paddle.tile(m_scalar, [1, self.ndim])
+        p = v * m_vec
+
+        x_hamilton = paddle.concat([q, p, other], axis=1)
+        x_hamilton.stop_gradient = False
+
+        total_energy = self.compute_energy(x_hamilton, edge_index)
+
+        # Penalize large energies to keep the learned Hamiltonian well scaled.
+        regularization = reg_weight * paddle.mean(total_energy**2)
+
+        return base_loss + regularization
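+
+
+# A minimal usage sketch for HGN (illustrative only; `_demo_hgn` is a
+# hypothetical helper, not part of the library API). With
+# output_keys=["velocity_derivative", "acceleration"], the model returns
+# dq/dt and dv/dt obtained from Hamilton's equations via autodiff. The mass
+# column is kept positive so the division by m in dv/dt stays well defined.
+def _demo_hgn():
+    edge_index = np.array(
+        [[0, 1, 1, 2, 2, 3, 3, 4], [1, 0, 2, 1, 3, 2, 4, 3]]
+    )
+    feats = paddle.randn([5, 5])  # [x, y, vx, vy, charge]
+    mass = paddle.rand([5, 1]) + 0.5  # strictly positive masses
+    x = paddle.concat([feats, mass], axis=1)
+
+    model = HGN(
+        input_keys=["x"],
+        output_keys=["velocity_derivative", "acceleration"],
+        n_f=6,
+        ndim=2,
+    )
+    out = model({"x": x, "edge_index": edge_index})
+    print(out["velocity_derivative"].shape)  # [5, 2] -> dq/dt
+    print(out["acceleration"].shape)  # [5, 2] -> dv/dt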
+
+
+class VarOGN(base.Arch):
+    def __init__(
+        self,
+        input_keys: List[str],
+        output_keys: List[str],
+        n_f: int = 6,
+        msg_dim: int = 100,
+        ndim: int = 2,
+        hidden: int = 300,
+        edge_index: Optional[np.ndarray] = None,
+        enable_sampling: bool = True,
+        l1_strength: float = 0.0,
+    ):
+        """
+        Initialize Variational ODE Graph Network (VarOGN).
+
+        Args:
+            input_keys: List of input keys.
+            output_keys: List of output keys.
+            n_f: Node feature dimension.
+            msg_dim: Message dimension.
+            ndim: Spatial dimension.
+            hidden: Hidden layer size.
+            edge_index: Edge indices (optional).
+            enable_sampling: Whether to sample messages during training.
+            l1_strength: Strength of the optional L1 penalty on the messages.
+
+        Examples:
+            >>> import ppsci
+            >>> import paddle
+            >>> import numpy as np
+            >>> model = ppsci.arch.VarOGN(
+            ...     input_keys=["x"],
+            ...     output_keys=["acceleration_mean", "acceleration_std"],
+            ...     n_f=6,
+            ...     ndim=2
+            ... )
+            >>> n_nodes = 5
+            >>> edge_index = np.array([[0, 1, 1, 2, 2, 3, 3, 4], [1, 0, 2, 1, 3, 2, 4, 3]])
+            >>> input_dict = {
+            ...     "x": paddle.randn([n_nodes, 6]),
+            ...     "edge_index": edge_index
+            ... }
+            >>> output = model(input_dict)
+            >>> print(output["acceleration_mean"].shape)
+            [5, 2]
+        """
+        super().__init__()
+
+        self.input_keys = input_keys
+        self.output_keys = output_keys
+        self.n_f = n_f
+        self.msg_dim = msg_dim
+        self.ndim = ndim
+        self.hidden = hidden
+        self.enable_sampling = enable_sampling
+        self.l1_strength = l1_strength
+
+        if edge_index is not None:
+            self.register_buffer(
+                "edge_index_buffer", paddle.to_tensor(edge_index, dtype="int64")
+            )
+        else:
+            self.edge_index_buffer = None
+
+        # The message head outputs interleaved (mu, logvar) pairs, hence 2x width.
+        self.msg_fnc = nn.Sequential(
+            nn.Linear(2 * n_f, hidden),
+            nn.ReLU(),
+            nn.Linear(hidden, hidden),
+            nn.ReLU(),
+            nn.Linear(hidden, hidden),
+            nn.ReLU(),
+            nn.Linear(hidden, msg_dim * 2),
+        )
+
+        self.node_fnc = nn.Sequential(
+            nn.Linear(msg_dim + n_f, hidden),
+            nn.ReLU(),
+            nn.Linear(hidden, hidden),
+            nn.ReLU(),
+            nn.Linear(hidden, hidden),
+            nn.ReLU(),
+            nn.Linear(hidden, ndim),
+        )
+
+    def message_passing(
+        self, x: paddle.Tensor, edge_index: np.ndarray
+    ) -> Tuple[paddle.Tensor, paddle.Tensor]:
+        """
+        Variational message passing.
+
+        Args:
+            x: Node features with shape [n, n_f] or [batch, n, n_f].
+            edge_index: Edge indices with shape [2, num_edges].
+
+        Returns:
+            Tuple[paddle.Tensor, paddle.Tensor]: (mean output, std output).
+        """
+        if len(x.shape) == 3:
+            batch_size, n, n_f = x.shape
+            x_reshaped = x.reshape([-1, n_f])
+
+            row, col = edge_index[0], edge_index[1]
+            col_t = paddle.to_tensor(col, dtype="int64")
+
+            results_mean = []
+            results_std = []
+            mus = []
+            logvars = []
+            for b in range(batch_size):
+                x_batch = x_reshaped[b * n : (b + 1) * n]
+
+                x_i = x_batch[col]
+                x_j = x_batch[row]
+
+                msg_input = paddle.concat([x_i, x_j], axis=1)
+                raw_msg = self.msg_fnc(msg_input)
+
+                # Interleaved variational parameters: even channels are means,
+                # odd channels are log-variances.
+                mu = raw_msg[:, 0::2]
+                logvar = raw_msg[:, 1::2]
+                mus.append(mu)
+                logvars.append(logvar)
+
+                if self.enable_sampling and self.training:
+                    # Reparameterization trick: m = mu + eps * sigma, eps ~ N(0, 1).
+                    epsilon = paddle.randn(mu.shape)
+                    msg = mu + epsilon * paddle.exp(logvar / 2.0)
+                else:
+                    msg = mu
+
+                aggr_out = paddle.scatter(
+                    paddle.zeros([n, self.msg_dim], dtype=msg.dtype),
+                    col_t,
+                    msg,
+                    overwrite=False,
+                )
+
+                node_input = paddle.concat([x_batch, aggr_out], axis=1)
+                out_mean = self.node_fnc(node_input)
+
+                # The std head is a placeholder: a small constant per output
+                # dimension that stands in for a learned uncertainty estimate.
+                out_std = paddle.ones([n, self.ndim], dtype=out_mean.dtype) * 0.1
+
+                results_mean.append(out_mean)
+                results_std.append(out_std)
+
+            # Cache the variational parameters for the KL term in compute_loss.
+            self._last_mu = paddle.concat(mus, axis=0)
+            self._last_logvar = paddle.concat(logvars, axis=0)
+
+            return paddle.stack(results_mean, axis=0), paddle.stack(results_std, axis=0)
+
+        else:
+            row, col = edge_index[0], edge_index[1]
+            col_t = paddle.to_tensor(col, dtype="int64")
+
+            x_i = x[col]
+            x_j = x[row]
+
+            msg_input = paddle.concat([x_i, x_j], axis=1)
+            raw_msg = self.msg_fnc(msg_input)
+
+            mu = raw_msg[:, 0::2]
+            logvar = raw_msg[:, 1::2]
+            self._last_mu = mu
+            self._last_logvar = logvar
+
+            if self.enable_sampling and self.training:
+                # Reparameterization trick: m = mu + eps * sigma, eps ~ N(0, 1).
+                epsilon = paddle.randn(mu.shape)
+                msg = mu + epsilon * paddle.exp(logvar / 2.0)
+            else:
+                msg = mu
+
+            num_nodes = x.shape[0]
+            aggr_out = paddle.scatter(
+                paddle.zeros([num_nodes, self.msg_dim], dtype=msg.dtype),
+                col_t,
+                msg,
+                overwrite=False,
+            )
+
+            node_input = paddle.concat([x, aggr_out], axis=1)
+            out_mean = self.node_fnc(node_input)
+
+            # Placeholder constant std per output dimension (see note above).
+            out_std = paddle.ones([num_nodes, self.ndim], dtype=out_mean.dtype) * 0.1
+
+            return out_mean, out_std
+
+    def forward(self, inputs: Dict[str, paddle.Tensor]) -> Dict[str, paddle.Tensor]:
+        """
+        Forward propagation (PaddleScience standard).
+
+        Args:
+            inputs: Input dictionary.
+
+        Returns:
+            Dict: Output dictionary containing mean and standard deviation.
+        """
+        x = inputs["x"]
+
+        if "edge_index" in inputs:
+            edge_index = inputs["edge_index"]
+            if isinstance(edge_index, paddle.Tensor):
+                edge_index = edge_index.numpy()
+            if len(edge_index.shape) == 3:
+                edge_index = edge_index[0]
+        elif self.edge_index_buffer is not None:
+            edge_index = self.edge_index_buffer.numpy()
+        else:
+            raise ValueError("Must provide edge_index")
+
+        accel_mean, accel_std = self.message_passing(x, edge_index)
+
+        outputs = {}
+        if "acceleration_mean" in self.output_keys:
+            outputs["acceleration_mean"] = accel_mean
+        if "acceleration_std" in self.output_keys:
+            outputs["acceleration_std"] = accel_std
+
+        if len(outputs) == 0:
+            outputs[self.output_keys[0]] = accel_mean
+
+        if self.l1_strength > 0:
+            # As in OGN, the L1 penalty targets the message means (cached by
+            # message_passing), which is what encourages sparse messages.
+            l1_reg = self.l1_strength * paddle.mean(paddle.abs(self._last_mu))
+            outputs["l1_regularization"] = l1_reg
+
+        return outputs
+
+    def compute_loss(
+        self,
+        inputs: Dict[str, paddle.Tensor],
+        labels: Dict[str, paddle.Tensor],
+        kl_weight: float = 1e-3,
+    ) -> paddle.Tensor:
+        """
+        Compute the variational loss (reconstruction + KL divergence).
+
+        Args:
+            inputs: Input dictionary.
+            labels: Label dictionary.
+            kl_weight: KL divergence weight.
+
+        Returns:
+            paddle.Tensor: Total loss.
+        """
+        outputs = self.forward(inputs)
+
+        pred = outputs.get("acceleration_mean", outputs.get(self.output_keys[0]))
+        true = labels.get("acceleration_true", labels.get("acceleration"))
+
+        recon_loss = paddle.mean(paddle.abs(pred - true))
+
+        # KL(N(mu, sigma^2) || N(0, 1)) for the message distributions, using
+        # the variational parameters cached by message_passing. Pulling the
+        # posterior towards a standard-normal prior sparsifies the messages.
+        kl_loss = -0.5 * paddle.mean(
+            1.0
+            + self._last_logvar
+            - paddle.square(self._last_mu)
+            - paddle.exp(self._last_logvar)
+        )
+
+        return recon_loss + kl_weight * kl_loss
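+
+
+# A minimal usage sketch for VarOGN (illustrative only; `_demo_var_ogn` and
+# the __main__ guard are hypothetical additions, not part of the library API).
+# In eval mode the messages collapse to their means, so repeated calls are
+# deterministic; in train mode the reparameterized sampling makes them vary.
+def _demo_var_ogn():
+    edge_index = np.array(
+        [[0, 1, 1, 2, 2, 3, 3, 4], [1, 0, 2, 1, 3, 2, 4, 3]]
+    )
+    model = VarOGN(
+        input_keys=["x"],
+        output_keys=["acceleration_mean", "acceleration_std"],
+        n_f=6,
+        ndim=2,
+    )
+    model.eval()  # use message means, no sampling
+    out = model({"x": paddle.randn([5, 6]), "edge_index": edge_index})
+    print(out["acceleration_mean"].shape)  # [5, 2]
+    print(out["acceleration_std"].shape)  # [5, 2]
+
+
+if __name__ == "__main__":
+    _demo_ogn()
+    _demo_hgn()
+    _demo_var_ogn()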