Add P-Tuning LSTM experiment with 50 virtual tokens to MetaMathQA benchmark by Akashsinghbhadoriya · Pull Request #3356 · huggingface/peft

Akashsinghbhadoriya · 2026-06-23T19:29:21Z

Description

Added a P-Tuning experiment for the MetaMathQA benchmark, as discussed in #2310.

P-tuning uses a prompt encoder (LSTM or MLP) to generate virtual tokens prepended to the input. This experiment tests the LSTM variant (encoder_reparameterization_type=LSTM) with 50 virtual tokens, complementing the existing MLP-based experiment.

Changes

Added method_comparison/MetaMathQA/experiments/ptuning/llama-3.2-3B-vt50-LSTM/adapter_config.json
Added method_comparison/MetaMathQA/results/ptuning--llama-3.2-3B-vt50-LSTM.json
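
For orientation, a config of roughly the following shape would select the LSTM variant. This is a minimal sketch, not the contents of the file added in this PR: the field names follow PEFT's `PromptEncoderConfig`, only the two values mentioned in the description are taken from the PR, and every other field of the real `adapter_config.json` is left out here.

```python
import json

# Hypothetical sketch of a P-tuning adapter config.
# The actual adapter_config.json in this PR may contain more fields.
config = {
    "peft_type": "P_TUNING",
    "task_type": "CAUSAL_LM",
    "num_virtual_tokens": 50,                   # from the PR description
    "encoder_reparameterization_type": "LSTM",  # "MLP" selects the other encoder
}

# Serialize the way a JSON config file would be stored.
config_text = json.dumps(config, indent=2)
print(config_text)
```

Switching between the two encoders would then only require changing `encoder_reparameterization_type`, assuming the remaining fields stay at their defaults.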

Results

The experiment was run on an NVIDIA RTX 4090 (48GB) using the default training params.

| Metric | P-tuning LSTM (vt=50) | P-tuning MLP (vt=20) |
| --- | --- | --- |
| Test accuracy | 0.3495 | 0.3821 |
| Train time | 1584.99s | 959.73s |
| Memory max | 27,438 MB | 19,980 MB |
| Trainable params | 434,371,584 | 28,382,208 |
| File size | 0.59 MB | 0.24 MB |
| Virtual tokens | 50 | 20 |
| Encoder type | LSTM | MLP |

Akashsinghbhadoriya · 2026-06-24T06:20:45Z

@BenjaminBossan can you review the PR

BenjaminBossan · 2026-06-24T09:41:48Z

Thanks for working on this P-Tuning experiment. It looks like the results are worse and it requires more memory compared to the existing default settings. Do you have the opportunity to run more experiments to see if you can get better results? Some possible further hyper-parameters to test would be learning rate and num_virtual_tokens.

Akashsinghbhadoriya · 2026-06-24T09:55:13Z

I tried changing num_virtual_tokens, increasing it from 20 to 50, and also used LSTM as the encoder instead of MLP. I ran only this one experiment. Do you have any suggestions? Should I decrease num_virtual_tokens to 20 and test LSTM, or is there a suitable learning rate to try? The memory usage increased because of the increase in virtual tokens.

BenjaminBossan · 2026-06-24T10:04:15Z

The idea when trying to optimize hyper-parameters is to try different combinations to see what works best. So in this case, you could try LSTM and MLP with different num_virtual_tokens, as well as changing the learning rate. If you see that a specific parameter leads to an improvement (say, increasing the learning rate), you could try changing that parameter even more in that direction to check if there is more of an improvement.
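
The "try different combinations" step can be sketched as a small grid over the three hyper-parameters mentioned here. The candidate values below are illustrative assumptions, not a recommended search space, and `build_grid` is a hypothetical helper rather than part of the benchmark code.

```python
from itertools import product

# Illustrative candidate values (assumptions, not taken from the benchmark).
encoders = ["MLP", "LSTM"]
virtual_tokens = [20, 50]
learning_rates = [5e-5, 1e-4, 5e-4]


def build_grid(encoders, virtual_tokens, learning_rates):
    """Return one settings dict per combination of the three hyper-parameters."""
    return [
        {"encoder_reparameterization_type": enc, "num_virtual_tokens": vt, "lr": lr}
        for enc, vt, lr in product(encoders, virtual_tokens, learning_rates)
    ]


grid = build_grid(encoders, virtual_tokens, learning_rates)
for settings in grid:
    print(settings)
```

Each dict would correspond to one experiment run. A full grid grows multiplicatively, so in practice it may be cheaper to fix two parameters and vary the third.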

Akashsinghbhadoriya · 2026-06-25T14:42:28Z

| Metric | MLP vt=20 lr=1e-4 | MLP vt=20 lr=5e-4 | MLP vt=50 lr=1e-4 | LSTM vt=50 lr=1e-4 | LSTM vt=20 lr=1e-4 |
| --- | --- | --- | --- | --- | --- |
| Test accuracy | 0.3821 | 0.3525 | 0.3419 | 0.3495 | 0.3381 |
| Train time (s) | 959.73 | 928.15 | 1010.87 | 1584.99 | 1322.59 |
| Memory max (MB) | 19,989 | 19,953 | 21,181 | 27,438 | 27,449 |
| Trainable params | 28,382,208 | 28,382,208 | 28,474,368 | 434,371,584 | 434,279,424 |
| File size (MB) | 0.23 | 0.23 | 0.59 | 0.59 | 0.23 |
| Encoder | MLP | MLP | MLP | LSTM | LSTM |
| Virtual tokens | 20 | 20 | 50 | 50 | 20 |
| LR | 1e-4 | 5e-4 | 1e-4 | 1e-4 | 1e-4 |

I tried running several different experiments; the default config is still the one performing best as of now.
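
The trainable-parameter counts in the table can be reproduced with a short back-of-the-envelope calculation. The sketch below assumes a hidden size of 3072 and a particular encoder layout: a virtual-token embedding followed by either a three-layer MLP, or a two-layer bidirectional LSTM with a two-layer linear head. Both the hidden size and the layout are assumptions that happen to match the reported numbers, not something stated in this thread.

```python
def linear_params(n_in, n_out):
    """Weights plus bias of a fully connected layer."""
    return n_in * n_out + n_out


def lstm_params(input_size, hidden_size, num_layers, bidirectional):
    """Parameter count of a stacked LSTM with two bias vectors per gate."""
    dirs = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        layer_in = input_size if layer == 0 else hidden_size * dirs
        # 4 gates, each with input weights, recurrent weights and two biases
        per_dir = 4 * (hidden_size * layer_in + hidden_size * hidden_size + 2 * hidden_size)
        total += per_dir * dirs
    return total


def ptuning_params(encoder, num_virtual_tokens, dim=3072):
    """Assumed layout: token embedding plus an MLP or LSTM re-parameterization."""
    embedding = num_virtual_tokens * dim
    if encoder == "MLP":
        head = 3 * linear_params(dim, dim)
    else:  # assumed: 2-layer bidirectional LSTM followed by two linear layers
        head = (
            lstm_params(dim, dim, num_layers=2, bidirectional=True)
            + linear_params(2 * dim, 2 * dim)
            + linear_params(2 * dim, dim)
        )
    return embedding + head


for encoder, vt in [("MLP", 20), ("MLP", 50), ("LSTM", 20), ("LSTM", 50)]:
    print(encoder, vt, f"{ptuning_params(encoder, vt):,}")
```

Under these assumptions, each additional virtual token adds only 3072 parameters, while the choice of encoder accounts for almost all of the difference. That is consistent with the memory column, which tracks the encoder type far more closely than the number of virtual tokens.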

BenjaminBossan · 2026-06-29T09:15:13Z

Thanks for reporting these new experiments. When you tried vt=50, did you also check lower and higher learning rates?

Akashsinghbhadoriya · 2026-06-30T06:05:27Z

No, for vt=50 I only used the default learning rate.

BenjaminBossan · 2026-06-30T09:14:15Z

Is it something that you could give a try?

Akashsinghbhadoriya · 2026-06-30T09:19:25Z

Yeah, sure. I guess it would be better to try it with MLP as the encoder instead of LSTM, since MLP seems to give better results than LSTM. What do you suggest?

BenjaminBossan · 2026-06-30T09:35:21Z

> Yeah, sure. I guess it would be better to try it with MLP as the encoder instead of LSTM, since MLP seems to give better results than LSTM.

I agree.

> What do you suggest?

I would vary the vt (let's start with 50) and then check if either increasing or decreasing the learning rate helps. If one of them does, try increasing/decreasing even more, until there is no more improvement. Ideally, this way you can find a setting that beats the current default.
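
The procedure described above amounts to a one-dimensional directional search over the learning rate. A minimal sketch follows; `directional_lr_search` and `toy_accuracy` are hypothetical names, and the toy objective merely stands in for a full training run.

```python
import math


def directional_lr_search(evaluate, start_lr, factor=10.0, max_steps=4):
    """Scale lr up (then down) by `factor` for as long as accuracy keeps improving."""
    start_acc = evaluate(start_lr)
    for step in (factor, 1.0 / factor):  # try increasing first, then decreasing
        lr, acc = start_lr, start_acc
        improved = False
        for _ in range(max_steps):
            candidate = lr * step
            candidate_acc = evaluate(candidate)
            if candidate_acc <= acc:
                break  # no further improvement in this direction
            lr, acc = candidate, candidate_acc
            improved = True
        if improved:
            return lr, acc
    return start_lr, start_acc


def toy_accuracy(lr):
    """Stand-in objective that peaks at lr=1e-3; NOT real benchmark data."""
    return -abs(math.log10(lr) + 3.0)


best_lr, best_acc = directional_lr_search(toy_accuracy, start_lr=1e-4)
print(best_lr, best_acc)
```

In a real run, `evaluate` would train with the given learning rate and return the test accuracy, so every call costs one full experiment. The step factor of 10 is an arbitrary choice for illustration.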

Akashsinghbhadoriya · 2026-07-01T07:36:13Z

| Metric | MLP vt=20 lr=1e-4 (default) | MLP vt=20 lr=5e-4 | MLP vt=50 lr=5e-5 | MLP vt=50 lr=1e-4 | MLP vt=50 lr=1e-3 | MLP vt=50 lr=5e-3 |
| --- | --- | --- | --- | --- | --- | --- |
| Test accuracy | 0.3821 | 0.3525 | 0.3055 | 0.3419 | 0.3669 | 0.3548 |
| Train time (s) | 959.73 | 928.15 | 1113.60 | 1010.87 | 1036.85 | 1029.78 |
| Memory max (MB) | 19,989 | 19,953 | 21,181 | 21,181 | 21,181 | 21,181 |
| LR | 1e-4 | 5e-4 | 5e-5 | 1e-4 | 1e-3 | 5e-3 |
| Virtual tokens | 20 | 20 | 50 | 50 | 50 | 50 |

For vt=50 there is no improvement over the default, whether we increase or decrease the learning rate. Since 50 virtual tokens does not improve the results, it would be better to decrease vt: we can try values in the 20-50 range and then test different learning rates.

BenjaminBossan · 2026-07-01T09:27:04Z

Thanks a lot for running these tests. Interesting that higher vt doesn't seem to help at all.

> It would be better to decrease vt

If you could try that, it would be great. Starting with 30 would be a good number IMO. Maybe it's also worth trying to decrease vt below 20, like 10 just to give it a try.

Akashsinghbhadoriya · 2026-07-01T16:49:07Z

| Metric | MLP vt=10 lr=1e-4 | MLP vt=20 lr=1e-4 (default) | MLP vt=30 lr=1e-4 |
| --- | --- | --- | --- |
| Test accuracy | 0.3313 | 0.3821 | 0.3419 |
| Train time (s) | 1028.57 | 959.73 | 1194.91 |
| Memory max (MB) | 19,893 | 19,989 | 20,470 |
| Trainable params | 28,351,488 | 28,382,208 | 28,412,928 |
| File size (MB) | 0.12 | 0.23 | 0.35 |
| Virtual tokens | 10 | 20 | 30 |

vt=20 is the best config as of now. Let me know if there is anything other than P-tuning that I can take up.
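
To summarize, the test accuracies reported across the tables in this thread can be collected and ranked in a few lines. The numbers are copied from the tables above; the dictionary layout is just one convenient way to hold them.

```python
# (encoder, virtual tokens, learning rate) -> test accuracy, as reported in this thread
results = {
    ("MLP", 10, 1e-4): 0.3313,
    ("MLP", 20, 1e-4): 0.3821,  # current default
    ("MLP", 20, 5e-4): 0.3525,
    ("MLP", 30, 1e-4): 0.3419,
    ("MLP", 50, 5e-5): 0.3055,
    ("MLP", 50, 1e-4): 0.3419,
    ("MLP", 50, 1e-3): 0.3669,
    ("MLP", 50, 5e-3): 0.3548,
    ("LSTM", 20, 1e-4): 0.3381,
    ("LSTM", 50, 1e-4): 0.3495,
}

# Sort from best to worst test accuracy.
ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
best_config, best_acc = ranked[0]

for (encoder, vt, lr), acc in ranked:
    print(f"{encoder:>4} vt={vt:<2} lr={lr:g}  acc={acc:.4f}")
```

Each entry comes from a single run, so small differences between neighboring settings may be within run-to-run noise.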

Akashsinghbhadoriya added 3 commits June 23, 2026 19:53

p-tuning with LSTM and 50 virtual tokens

2d9c67d

p-tuning with LSTM and 50 virtual tokens

a9dc7f6

Merge remote-tracking branch 'upstream/main'

f9c84af

Akashsinghbhadoriya added 2 commits June 25, 2026 18:26

p-tuning with LSTM and 20 virtual tokens

0fecbcd

p-tuning with LSTM and 20 virtual tokens lr 5e-4

5cfde7b

Akashsinghbhadoriya force-pushed the main branch from bd8fca6 to 1b09b2c Compare June 25, 2026 13:40

p-tuning with MLP and 20 virtual tokens lr 5e-4

f8d0501

Akashsinghbhadoriya force-pushed the main branch from 1b09b2c to f8d0501 Compare June 25, 2026 13:41

p-tuning with MLP and 50 virtual tokens

ed69ccd

Akashsinghbhadoriya and others added 4 commits July 1, 2026 11:17

p-tuning with MLP 50 vt and 5e-5 lr

ab7f799

Merge branch 'huggingface:main' into main

9a7f8da

p-tuning with MLP 50 vt and 1e-3 lr

02973be

p-tuning with MLP 50 vt and 5e-3 lr

e390c22

Akashsinghbhadoriya and others added 3 commits July 1, 2026 15:44

p-tuning with MLP vt 30

d81997c

Merge branch 'huggingface:main' into main

0fba0c3

p-tuning with MLP vt 10

36b70a3
