Question about a model training problem #157

@MrWangChong

Epoch: 10/10, Batch:0/1, Loss: 0.0000: 100%|██████████████████████████████████████████████████████████████████████| 1/1 [01:28<00:00, 88.81s/it]
Running Evaluation: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:14<00:00, 14.68s/it]
2024-12-30 15:08:37.588 | DEBUG | text2vec.sentence_model:evaluate:277 - labels: [0, 1, 1, 0, 1, 1, 1, 1, 0, 0]
2024-12-30 15:08:37.600 | DEBUG | text2vec.sentence_model:evaluate:278 - preds: [0.02931239, 0.8442257, 0.9048866, -0.10592551, 0.9284405, 0.93183136, 0.8627026, 0.8772242, -0.050434127, -0.073510975]
2024-12-30 15:08:37.600 | DEBUG | text2vec.sentence_model:evaluate:279 - pearson: 0.9919297894994221, spearman: 0.8432740427115677
2024-12-30 15:08:37.620 | INFO | text2vec.sentence_model:eval_model:231 - {'eval_spearman': 0.8432740427115677, 'eval_pearson': 0.9919297894994221}

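Everything looks normal during training: the labels and preds in the log above match well, and the evaluation metrics reported during training are fine too. But the evaluation run after training finishes suddenly looks wrong. As the labels and preds in the log below show, the scores seem completely inconsistent with what I saw during training. Could someone explain what is going on?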

Epoch: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [17:14<00:00, 103.49s/it]
2024-12-30 15:08:37.740 | INFO | text2vec.cosent_model:train_model:130 - Training model done. Saved to ./outputs/STS-B-model.
2024-12-30 15:08:37.745 | INFO | main:main:87 - Model saved to ./outputs/STS-B-model
2024-12-30 15:08:39.283 | DEBUG | text2vec.sentence_model:init:78 - Use pytorch device: cpu
2024-12-30 15:08:39.313 | DEBUG | main:main:118 - ('A节主变风机出现比较大的异常声音。', 'A节主变水机出现比较大的异常震动。', 0)
2024-12-30 15:08:40.095 | DEBUG | main:main:120 - <class 'numpy.ndarray'>, (26, 768), (768,)
2024-12-30 15:08:41.805 | DEBUG | main:calc_similarity_scores:26 - labels: [0 1 1 0 1 1 1 1 0 0]
2024-12-30 15:08:41.810 | DEBUG | main:calc_similarity_scores:27 - preds: [0.8464496 0.8778059 0.953287 0.75261563 0.9444511 0.91630775 0.94660693 0.8541326 0.7975155 0.7461488 ]
2024-12-30 15:08:41.810 | DEBUG | main:calc_similarity_scores:28 - Spearman: 0.8432740427115677
2024-12-30 15:08:41.810 | DEBUG | main:calc_similarity_scores:29 - spend time: 1.6849, count:52, qps: 30.86247843307947

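The training command is the stock one: python training_sup_text_matching_model_mydata.py --do_train --do_predict
The only change I made: save_model_every_epoch: bool = False
The training, validation, and test sets are three identical copies of data I entered by hand. I tried labels of 0 and 1 as well as 1-5, and the result is the same either way.
I also tried training other model types and hit the same problem every time: labels and preds look fine during training, but not once training has finished.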
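
One detail in the two logs may help narrow this down: the Spearman value is identical in both (0.8432740427115677) even though the preds look very different. Spearman correlation depends only on the rank order of the scores, not on their absolute range. The sketch below recomputes it in plain Python from the ten labels and preds printed in each log. The helper names (`average_ranks`, `pearson`, `spearman`) are illustrative and not part of text2vec. Since `count:52` suggests more pairs were evaluated than the ten printed, this is not expected to reproduce the logged number; it only checks whether the two printed score lists rank the pairs equivalently.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Values copied from the two logs (only the ten printed entries).
labels = [0, 1, 1, 0, 1, 1, 1, 1, 0, 0]
preds_during_training = [0.02931239, 0.8442257, 0.9048866, -0.10592551,
                         0.9284405, 0.93183136, 0.8627026, 0.8772242,
                         -0.050434127, -0.073510975]
preds_after_training = [0.8464496, 0.8778059, 0.953287, 0.75261563,
                        0.9444511, 0.91630775, 0.94660693, 0.8541326,
                        0.7975155, 0.7461488]

rho_train = spearman(labels, preds_during_training)
rho_final = spearman(labels, preds_after_training)
print(rho_train, rho_final)
```

In both printed lists every pair labelled 0 scores below every pair labelled 1, so the two coefficients agree on these ten samples. That would be consistent with the final model ranking the pairs the same way and only the scale of the similarity scores differing between the two evaluation paths, though the logs alone do not show why the scale differs.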
