Skip to content

Conversation

@lkdvos
Copy link
Member

@lkdvos lkdvos commented Jun 14, 2025

This is an initial implementation for gauge fixing a PEPS based on the algorithm described here

From what I can tell it is mostly functional, although it requires some tests etc to actually see how well it performs in practice.
The idea would be to attempt to stabilize some gradient methods by re-gauging the state every N steps.

TODO:

  • Support for fermions.
  • Enforce positive semi-definiteness of message matrices.
  • Allow any virtual leg arrow directions in the PEPS.
  • Rotation (rotl90 etc.) of BPEnv.
  • Conversion from BPEnv to CTMRGEnv and SUWeight.

@codecov
Copy link

codecov bot commented Jun 14, 2025

Codecov Report

❌ Patch coverage is 86.02941% with 38 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/algorithms/contractions/bp_contractions.jl 86.13% 14 Missing ⚠️
src/environments/bp_environments.jl 82.27% 14 Missing ⚠️
src/algorithms/bp/gaugefix.jl 82.92% 7 Missing ⚠️
src/algorithms/bp/beliefpropagation.jl 94.00% 3 Missing ⚠️
Files with missing lines Coverage Δ
src/PEPSKit.jl 100.00% <ø> (ø)
src/networks/local_sandwich.jl 92.85% <100.00%> (ø)
src/algorithms/bp/beliefpropagation.jl 94.00% <94.00%> (ø)
src/algorithms/bp/gaugefix.jl 82.92% <82.92%> (ø)
src/algorithms/contractions/bp_contractions.jl 86.13% <86.13%> (ø)
src/environments/bp_environments.jl 82.27% <82.27%> (ø)

... and 3 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Yue-Zhengyuan
Copy link
Member

Yue-Zhengyuan commented Jun 15, 2025

I remember that BP is equivalent to simple update (PhysRevResearch.3.023073), although I haven't looked at the details (such as whether the bond messages are guaranteed to be real diagonal, and thus already represented by SUWeight). We may use this equivalence to avoid introducing redundant structs, or generalize the existing simple update algorithm.

BTW, YASTN people are also working on BP, and with NNN Hamiltonian support (in one of the branches) as well.

@lkdvos
Copy link
Member Author

lkdvos commented Jun 15, 2025

I looked into trying to reuse some of that functionality, and at least for the BPEnv struct there really are substantial differences: note how there are 2 messages per edge, one going in each direction, which are connected between bra and ket, instead of between the virtual edges of the PEPS. That's why I ended up modeling this more like the CTMRGEnv, where I'm effectively storing a CTMRG environment with chi = 1 trivial legs. (I'll try add a converter/promoter from BPEnv to CTMRGEnv to at some point.)

I do agree that it would be nice to join the simple update part, but for now I was actually mostly interested in affecting the gauge of a PEPS, and we can look into extending this to actually do updates in the future.

It's nice to hear that YASTN also are finding this helpful! I don't think adding NNN support would be too hard, the contractions with these rank 1 environments are a lot more manageable in general.
Something I would like to look into at some point is improving the accuracy with loop corrections, see https://arxiv.org/pdf/2409.03108

@pbrehmer
Copy link
Collaborator

Thanks for getting this started! I'd be happy to join in on this since I was recently talking to Bram about PEPS optimization, how well it might perform and how that might be related to gauge freedoms. Specifically, I was looking at optimization runs that get stuck in linesearching at some point - e.g. when choosing a CTMRG environment dimension that is too low. Around these iterations where the optimizer gets stuck I computed cuts through the energy landscape along the current gradient directions and I was getting all sorts of weird curves with cusps or discontinuities (inspired by Appendix B of the periodic PEPS paper: https://arxiv.org/pdf/2411.12731).

For example, a Heisenberg model at $D=3$ and $\chi=4$ will look like this ($i$ being the LBFGS iteration, $g$ the gradient at iteration $i$):
image

At $D=3$ and $\chi=8$ it looks a bit more wild:
image

In these cases you can really see that a low environment dimension will first get you smooth convex functions that become more closely centered around $\alpha=0$ and then cusps and discontinuities may appear. In these cases the linesearching algorithms of course run into trouble and eventually may spit out unphysical optimization steps. Also, this process of breaking down might stretch over many LBFGS iterations.

I seemed that gauging by symmetrizing the PEPS spatially did improve the energy landscapes but I haven't properly investigated that. In any case, I would be quite interested to see how BP gauging could improve optimization convergence and perhaps stabilize more complicated optimization runs.

@Yue-Zhengyuan
Copy link
Member

Something I would like to look into at some point is improving the accuracy with loop corrections, see https://arxiv.org/pdf/2409.03108

I also have wanted to try it out for quite a while, and see how it can systemically improve over SU.

To suggest some tests for the current PR, we can first check that if the bond weights produced by SU (with identity gates) are consistent with the BP message tensors.

@lkdvos
Copy link
Member Author

lkdvos commented Jun 16, 2025

@pbrehmer these are some really cool plots! This was indeed one of the primary use-cases I had in mind, so it would be cool to see how this alters these gradient plots.

As a side note, at such low bond dimensions I would also be slightly cautious about picking bond dimensions that necessarily break the symmetry, for example in a U1 case with charge conjugation I would expect that you might need to check for each D whether or not it gives a valid combination. This is motivated by similar problems for MPS simulations, where certain bond dimensions don't work because they have to cut inbetween multiplets.

@GlebFedorovich
Copy link

GlebFedorovich commented Jun 17, 2025

@pbrehmer Indeed, we have seen this a lot for periodic PEPS, and pushing $$\chi$$ and imposing some rotation and mirror symmetries help to stabilize the optimization and avoid these discontinuous points. However, there are many cases, beyond ising and heisenberg models, when imposing spatial symmetries leads to higher energies compared let's say with just SU (or with AD fixed-point one before the optimization arrives at "bad" point :) )

@pbrehmer
Copy link
Collaborator

However, there are many cases, beyond ising and heisenberg models, when imposing spatial symmetries leads to higher energies compared let's say with just SU (or with AD fixed-point one before the optimization arrives at "bad" point :) )

Yes I have also noticed that recently in some cases! Good to know that that seems to be consistent with your findings :)

Copy link
Member

@Yue-Zhengyuan Yue-Zhengyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since adding BP may need some revisions for SU and InfiniteWeightPEPS, I want to ask a few things to get more familiar with it.

@kshyatt
Copy link
Member

kshyatt commented Nov 14, 2025

Resolved the src/PEPSKit.jl conflict if anyone wants to pick this back up

@github-actions
Copy link
Contributor

github-actions bot commented Nov 14, 2025

Your PR no longer requires formatting changes. Thank you for your contribution!

@kshyatt
Copy link
Member

kshyatt commented Nov 27, 2025

Should the doc build here be fixed by #301?

@leburgel
Copy link
Member

Should the doc build here be fixed by #301?

Yes, should be fixed now!

Copy link
Member Author

@lkdvos lkdvos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for continuing working on this, looks nice!

I have a couple things that I would like to bring up, that I still remember from when I was implementing this.

The first is that because of the layout of the algorithm, at some point I was wondering if it would actually make sense to have the south and west messages defined as the adjoint of what they are right now. The idea being the same as how typically for left/right eigenvectors we write A * R and L' * A, and I somehow had in my head that this would make things a bit more symmetric. Additionally, I think that means that for standard arrows no twists appear, while otherwise that would be the case. (although I have to double check this)

The second is that I need to double check if I implemented the efficient stopping criterion, rather than simply playing around with the default slow one.

I'm not a huge fan of the updated convention for the coordinates, since I find it a bit unintuitive if I just consider the algorithm itself, but I understand that the similarity to CTMRGEnv is also important.

Finally, I don't remember what we decided in the end, but I would not be opposed to gaugefixing resulting in a state that has different arrows than the original one. We can still add in convenience flipping functionality to fix that afterwards, but I am not a big fan of always paying that price. (This is also how MPSKit works, this will change the arrows to the canonical convention)

I should have some time next week to look into this myself as well though.

@Yue-Zhengyuan
Copy link
Member

it would actually make sense to have the south and west messages defined as the adjoint of what they are right now

If we are going to enforce the messages to be semi-positive definite, I think taking adjoints won't make any differences?

@lkdvos lkdvos self-assigned this Dec 4, 2025
@lkdvos lkdvos marked this pull request as ready for review December 4, 2025 22:49
@Yue-Zhengyuan
Copy link
Member

For fermions, do we also take the approach that first flipping virtual arrows (for both the state and the BPEnv) to leftwards/downwards, then finding BP fixed point and gauge transformation, and finally flipping back?

@lkdvos
Copy link
Member Author

lkdvos commented Dec 5, 2025

So, I pushed some more changes to clean the code up a bit, and in particular made sure everything is type stable now.
There are a couple questions I have left, which I want to pose here but would also be fine leaving for later:

  • In the gauging procedure, the current choice of all messages being top <- bot is what makes it such that we require additional transpose functions there. This could be fine, but is slightly unfortunate since this cannot in general be done without re-allocating. It might be more convenient in the long run to have the messages follow the "canonical arrow" convention, such that north and east we get top <- bot and south and west we get bot <- top. Alternatively, if we really want all of them to have the same layout (which does make rotations a bit easier), I would even consider having the north and east top <- bot, and the south and west be adjoint(bot <- top) = (top <- bot). This is equivalent to defining the south and west messages as the conj of what they are now. Ultimately, this probably never shows up in a profiler though.
  • Since I did the gauging procedure in one go now, I no longer really have the need of the SUWeight intermediate step, so I'm also not outputting this. I was trying to consider bringing back SUWeight(::BPEnv), but I have to say that the choice of having weights that follow the arrows instead of with a fixed direction is really unintuitive to me, so I simply left this out for now. Similarly, having a more detailed look I am very surprised that the flip_svd function doesn't only flip the arrows but also transposes S. Is this something we could rediscuss? edit: I definitely like the approach of just flipping before and after the procedure.
  • We probably will have to tinker a bit more with the BeliefPropagation struct in the near future, since there are a bunch of settings missing. Thinking of pseudo-inverse cutoffs, SVD/Eig algorithms, but also of "schemes" for doing the BP iterations. Currently we are updating all messages in parallel, while in principle we could snake or alternate updates which might be more efficient for larger unit cells.
  • I vaguely remember discussing for SU whether or not we want to store the weights, or their square roots. Is this something we would like to revisit?

@lkdvos lkdvos changed the title [WIP] Belief propagation gauge fixing Belief propagation gauge fixing Dec 5, 2025
@Yue-Zhengyuan
Copy link
Member

Yue-Zhengyuan commented Dec 5, 2025

Here we have at least one more choices of what should be the "canonical" arrow direction: same as the bond message orientation, i.e. a message to tensor A from tensor B has axis order (leg connected to A, leg connected to B).

Here we are using this choice, which I think is more natural. Although it requires transposing during gauging, I think it makes the BP equations (in bp_contractions.jl) simpler. Otherwise implicit permutes (or transposes) are needed in BP update @tensor calls, which is more error-prone for fermions.

For SUWeight, I do want to make its axis order independent of virtual arrow directions. But it is better to deal with it in a follow-up PR. The main point of the current design is to avoid minus signs introduced by flips (which is canceled by a extra permutation that reverses the axis order), which makes taking square roots a bit more involved.

A similar problem exists here for fermions when we initialize the messages with identity operators. For our current choice of axis order, I feel that for some messages, elements in the odd sector should be -1, i.e. the message is the permuted identity operator. This can be seen by requiring the trivial BPEnv to be the parity matrices needed to cancel the twists when evaluating the inner product of a local patch of two iPEPS, interpreting the open virtual bonds on the boundary of the patch as physical ones.

@lkdvos
Copy link
Member Author

lkdvos commented Dec 5, 2025

Here we are using this choice, which I think is more natural. Although it requires transposing during gauging, I think it makes the BP equations (in bp_contractions.jl) simpler. Otherwise implicit permutes (or transposes) are needed in BP update @tensor calls, which is more error-prone for fermions.

I'm pretty sure that the implicit permutes shouldn't affect anything, at least that's what the @tensor macro should take care of. My suggestion for the `adjoint definition would result in the following contractions by the way:

Details
function contract_north_message(
        A::PEPSSandwich, M_west::PEPSMessage, M_north::PEPSMessage, M_east::PEPSMessage
    )
    return @autoopt @tensor begin
        M_north′[DSt; DSb] :=
            ket(A)[d; DNt DEt DSt DWt] * conj(bra(A)[d; DNb DEb DSb DWb]) *
            conj(M_west[DWt; DWb]) * M_north[DNt; DNb] * M_east[DEt; DEb]
    end
end
function contract_east_message(
        A::PEPSSandwich, M_north::PEPSMessage, M_east::PEPSMessage, M_south::PEPSMessage
    )
    return @autoopt @tensor begin
        M_east′[DWt; DWb] :=
            ket(A)[d; DNt DEt DSt DWt] * conj(bra(A)[d; DNb DEb DSb DWb]) *
            M_north[DNt; DNb] * M_east[DEt; DEb] * conj(M_south[DSt; DSb])
    end
end
function contract_south_message(
        A::PEPSSandwich, M_east::PEPSMessage, M_south::PEPSMessage, M_west::PEPSMessage
    )
    return @autoopt @tensor begin
        M_south′[DNt; DNb] :=
            conj(ket(A)[d; DNt DEt DSt DWt]) * bra(A)[d; DNb DEb DSb DWb] *
            conj(M_east[DEt; DEb]) * M_south[DSt; DSb] * M_west[DWt; DWb]
    end
end
function contract_west_message(
        A::PEPSSandwich, M_south::PEPSMessage, M_west::PEPSMessage, M_north::PEPSMessage
    )
    return @autoopt @tensor begin
        M_west′[DEt; DEb] :=
            conj(ket(A)[d; DNt DEt DSt DWt]) * bra(A)[d; DNb DEb DSb DWb] *
            M_south[DSt; DSb] * M_west[DWt; DWb] * conj(M_north[DNt; DNb])
    end
end

absorb_north_message(A::PEPSTensor, M::PEPSMessage) =
    @tensor A′[d; N' E S W] := A[d; N E S W] * M[N; N']
absorb_east_message(A::PEPSTensor, M::PEPSMessage) =
    @tensor A′[d; N E' S W] := A[d; N E S W] * M[E; E']
absorb_south_message(A::PEPSTensor, M::PEPSMessage) =
    @tensor A′[d; N E S' W] := A[d; N E S W] * conj(M[S; S'])
absorb_west_message(A::PEPSTensor, M::PEPSMessage) =
    @tensor A′[d; N E S W'] := A[d; N E S W] * conj(M[W; W'])

which is the equivalent of defining M_south_new = adjoint(transpose(M_south)) and the same for M_west_new. Unless I'm mistaken, that would lead to the following permutation-free message combination:

MM = M_north * M_south'

What I like about this is that all index orders are still top to bottom and the spaces are still equal for north and south, there are no unneeded permutations, and this is the same kind of notation as thinking about eigenvectors like:

$$\begin{align} A \cdot r &= \lambda \cdot r \\\ A^\dagger \cdot l &= \lambda \cdot l \\\ \implies l^\dagger \cdot A &= \lambda^* \cdot l^\dagger \end{align}$$

For SUWeight, I do want to make its axis order independent of virtual arrow directions. But it is better to deal with it in a follow-up PR

Fully agree to not tackle any of the SU code in this PR, I just wanted to briefly think about it.

A similar problem exists here for fermions when we initialize the messages with identity operators.

This is a very good point. I think this is a great argument in favor of just canonicalizing the arrows before the procedures, and flipping back after. I'll implement this accordingly for BP in this PR already.


By the way, how do you feel about me just restricting the BP implementation for two-layer for now, and not exporting it? I think most of it isn't actually working for PEPO nor partition functions and I'm not even sure yet how to handle these (e.g. hermitian projection does not really make sense)

@leburgel
Copy link
Member

leburgel commented Dec 5, 2025

We probably will have to tinker a bit more with the BeliefPropagation struct in the near future, since there are a bunch of settings missing. Thinking of pseudo-inverse cutoffs, SVD/Eig algorithms, but also of "schemes" for doing the BP iterations. Currently we are updating all messages in parallel, while in principle we could snake or alternate updates which might be more efficient for larger unit cells.

Would it make sense to split the BeliefPropagation struct off from the gauge fixing algorithm, which would then be some other struct (BPGauge, or any other suitable name). The reason would be that BeliefPropagation actually contains all the settings needed to actually run BP and find the fixed point, i.e. the BPEnv (up to the iteration "schemes", these definitely also belong here). But things like pseudo-inverse cutoffs and decomposition algorithms are only needed when using the BPEnv to do an actual gauge fixing, which is a whole extra step on top that we don't always do I think. For example, to do a crude approximate contraction of a network with a BPEnv we only need the current BeliefPropagation struct, without any roots or decompositions.

This would also be compatible with other gauging algorithms, which also require an algorithm subroutine such as BeliefPropagation (e.g. an LBFGS or Arnoldi subroutine).

@Yue-Zhengyuan
Copy link
Member

Yue-Zhengyuan commented Dec 5, 2025

how do you feel about me just restricting the BP implementation for two-layer for now, and not exporting it?

I cannot imagine how to do BP gauging with messages with $n \ne 2$ legs either (though BP itself is fine, so one more reason to separate BP and gauging). So definitely OK to restrict to PEPS-norm networks.

@pbrehmer
Copy link
Collaborator

pbrehmer commented Dec 5, 2025

Would it make sense to split the BeliefPropagation struct off from the gauge fixing algorithm, which would then be some other struct (BPGauge, or any other suitable name).

I think this would in general make sense since there are a few different gauging algorithms which we might want to try out at some point (BP, MCF, maximal fidelity, ...). Would you also think it'd be a good idea to use the same function (gauge_fix) and same abstract algorithm struct (for example GaugeFixingAlgorithm) to also gauge fix the PEPS environments? Because in principle there are also different styles, namely those in the appendix from Bram and Anna's paper and we also need a specialized gauge_fix for C4v environments.

Comment on lines +41 to +48
U, Λ, Vᴴ = svd_compact!(sqrtM * sqrtMᴴ)
if !isdual(space(Mᴴ, 1))
U, Λ′, Vᴴ = flip_svd(U, Λ, Vᴴ)
Λ = transpose(Λ′)
end
sqrtΛ = sdiag_pow(Λ, 1 / 2)
X = isqrtM * U * sqrtΛ
invX = sqrtΛ * Vᴴ * isqrtMᴴ
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be safer to flip_svd after X and invX have been obtained using the "canonical" virtual arrow.

Suggested change
U, Λ, Vᴴ = svd_compact!(sqrtM * sqrtMᴴ)
if !isdual(space(Mᴴ, 1))
U, Λ′, Vᴴ = flip_svd(U, Λ, Vᴴ)
Λ = transpose(Λ′)
end
sqrtΛ = sdiag_pow(Λ, 1 / 2)
X = isqrtM * U * sqrtΛ
invX = sqrtΛ * Vᴴ * isqrtMᴴ
U, Λ, Vᴴ = svd_compact!(sqrtM * sqrtMᴴ)
sqrtΛ = sdiag_pow(Λ, 1 / 2)
X = isqrtM * U * sqrtΛ
invX = sqrtΛ * Vᴴ * isqrtMᴴ
if !isdual(space(Mᴴ, 1))
X, Λ, invX = flip_svd(X, Λ, invX)
end

@Yue-Zhengyuan
Copy link
Member

After merging #315 we can try bringing back the BPEnv to SUWeight conversion as part of the gauging process. And possibly include SU-gauging as well (at least for test purposes)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants