After training, the results are worse than the default model 😓. Could someone take a look at what's wrong? #12136
Dataset: https://aistudio.baidu.com/datasetdetail/107509

1. Split the dataset

python PPOCRLabel/gen_ocr_train_val_test.py --trainValTestRatio 6:2:2 --datasetRootPath ./train_data --detRootPath ./train_data/det --recRootPath ./train_data/rec

Label.txt

ocr/0000.jpg [{"transcription": "吴运", "points": [[94, 55], [139, 55], [139, 80], [94, 80]], "difficult": false, "key_cls": "name"}, {"transcription": "性别机器人民族汉", "points": [[42, 84], [218, 84], [218, 111], [42, 111]], "difficult": false, "key_cls": "sex"}, {"transcription": "2006年9月11日", "points": [[88, 116], [245, 116], [245, 140], [88, 140]], "difficult": false, "key_cls": "birth"}, {"transcription": "广东省广州某某666号—白制数据集", "points": [[91, 150], [279, 150], [279, 193], [91, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "213984748281422633", "points": [[144, 240], [311, 240], [311, 260], [144, 260]], "difficult": false, "key_cls": "card"}]
ocr/0001.jpg [{"transcription": "何汊", "points": [[95, 55], [136, 55], [136, 78], [95, 78]], "difficult": false, "key_cls": "name"}, {"transcription": "性别女民族汉", "points": [[47, 88], [205, 88], [205, 108], [47, 108]], "difficult": false, "key_cls": "sex"}, {"transcription": "1998年5月6日", "points": [[91, 118], [237, 118], [237, 139], [91, 139]], "difficult": false, "key_cls": "birth"}, {"transcription": "湖南省长沙某某666号—自制数据集", "points": [[90, 153], [280, 153], [280, 197], [90, 197]], "difficult": false, "key_cls": "adress"}, {"transcription": "081494794611860592", "points": [[145, 235], [310, 235], [310, 260], [145, 260]], "difficult": false, "key_cls": "card"}]
ocr/0002.jpg [{"transcription": "雷专鼢", "points": [[94, 54], [157, 54], [157, 81], [94, 81]], "difficult": false, "key_cls": "name"}, {"transcription": "性别机器人民族汉", "points": [[45, 88], [207, 88], [207, 107], [45, 107]], "difficult": false, "key_cls": "sex"}, {"transcription": "2018年11月24日", "points": [[90, 117], [239, 117], [239, 141], [90, 141]], "difficult": false, "key_cls": "birth"}, {"transcription": "安徽省合肥某某666号—自制数据集", "points": [[91, 151], [278, 151], [278, 192], [91, 192]], "difficult": false, "key_cls": "adress"}, {"transcription": "965073295312179882", "points": [[145, 238], [310, 238], [310, 258], [145, 258]], "difficult": false, "key_cls": "card"}]
ocr/0003.jpg [{"transcription": "卫肮", "points": [[92, 55], [145, 55], [145, 81], [92, 81]], "difficult": false, "key_cls": "name"}, {"transcription": "性别女民族汉", "points": [[45, 88], [207, 88], [207, 106], [45, 106]], "difficult": false, "key_cls": "sex"}, {"transcription": "2019年12月17日", "points": [[88, 116], [238, 116], [238, 137], [88, 137]], "difficult": false, "key_cls": "birth"}, {"transcription": "河南省郑州某某666号—自制数据集", "points": [[90, 153], [277, 153], [277, 192], [90, 192]], "difficult": false, "key_cls": "adress"}, {"transcription": "163878699991251580", "points": [[144, 238], [314, 238], [314, 258], [144, 258]], "difficult": false, "key_cls": "card"}]
ocr/0004.jpg [{"transcription": "郝颁", "points": [[93, 56], [144, 56], [144, 81], [93, 81]], "difficult": false, "key_cls": "name"}, {"transcription": "性别男民族汉", "points": [[46, 88], [207, 88], [207, 107], [46, 107]], "difficult": false, "key_cls": "sex"}, {"transcription": "1984年9月19日", "points": [[90, 115], [237, 115], [237, 139], [90, 139]], "difficult": false, "key_cls": "birth"}, {"transcription": "浙江省杭州某某666号—自制数据集", "points": [[90, 154], [278, 154], [278, 193], [90, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "803226740643271224", "points": [[144, 236], [315, 236], [315, 262], [144, 262]], "difficult": false, "key_cls": "card"}]
ocr/0005.jpg [{"transcription": "郝嫖", "points": [[93, 54], [144, 54], [144, 82], [93, 82]], "difficult": false, "key_cls": "name"}, {"transcription": "性别机器人民族汉", "points": [[46, 86], [208, 86], [208, 111], [46, 111]], "difficult": false, "key_cls": "sex"}, {"transcription": "1979年7月17日", "points": [[90, 118], [239, 118], [239, 137], [90, 137]], "difficult": false, "key_cls": "birth"}, {"transcription": "湖北省武汉某某666号一自制数据集", "points": [[91, 152], [277, 152], [277, 193], [91, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "974132654656644874", "points": [[144, 237], [311, 237], [311, 260], [144, 260]], "difficult": false, "key_cls": "card"}]
ocr/0006.jpg [{"transcription": "茅钇鲅", "points": [[93, 56], [156, 56], [156, 77], [93, 77]], "difficult": false, "key_cls": "name"}, {"transcription": "性别未知民族汉", "points": [[46, 88], [207, 88], [207, 107], [46, 107]], "difficult": false, "key_cls": "sex"}, {"transcription": "1978年2月26日", "points": [[89, 118], [241, 118], [241, 139], [89, 139]], "difficult": false, "key_cls": "birth"}, {"transcription": "福建省福州某某666号—自制数据集", "points": [[89, 152], [279, 152], [279, 196], [89, 196]], "difficult": false, "key_cls": "adress"}, {"transcription": "267124840234334444", "points": [[145, 237], [310, 237], [310, 257], [145, 257]], "difficult": false, "key_cls": "card"}]
ocr/0007.jpg [{"transcription": "薛氅沂", "points": [[92, 55], [155, 55], [155, 78], [92, 78]], "difficult": false, "key_cls": "name"}, {"transcription": "性别男民族汉", "points": [[45, 87], [207, 87], [207, 106], [45, 106]], "difficult": false, "key_cls": "sex"}, {"transcription": "1995年7月14日", "points": [[88, 118], [236, 118], [236, 138], [88, 138]], "difficult": false, "key_cls": "birth"}, {"transcription": "黑龙江省哈尔滨某某666号自制数据集", "points": [[90, 151], [280, 151], [280, 196], [90, 196]], "difficult": false, "key_cls": "adress"}, {"transcription": "556010053044157446", "points": [[143, 237], [315, 237], [315, 262], [143, 262]], "difficult": false, "key_cls": "card"}]
ocr/0008.jpg [{"transcription": "殷蓊", "points": [[93, 53], [144, 53], [144, 82], [93, 82]], "difficult": false, "key_cls": "name"}, {"transcription": "性别女民族汉", "points": [[46, 88], [207, 88], [207, 107], [46, 107]], "difficult": false, "key_cls": "sex"}, {"transcription": "2017年8月6日", "points": [[89, 116], [237, 116], [237, 139], [89, 139]], "difficult": false, "key_cls": "birth"}, {"transcription": "河北省石家庄某某666号一自制数据集", "points": [[91, 151], [279, 151], [279, 194], [91, 194]], "difficult": false, "key_cls": "adress"}, {"transcription": "20190257831503175X", "points": [[145, 237], [314, 237], [314, 260], [145, 260]], "difficult": false, "key_cls": "card"}]
ocr/0009.jpg [{"transcription": "熊杼", "points": [[95, 55], [139, 55], [139, 80], [95, 80]], "difficult": false, "key_cls": "name"}, {"transcription": "性别机器人民族汉", "points": [[45, 88], [215, 88], [215, 109], [45, 109]], "difficult": false, "key_cls": "sex"}, {"transcription": "2013年4月5日", "points": [[89, 116], [237, 116], [237, 139], [89, 139]], "difficult": false, "key_cls": "birth"}, {"transcription": "广东省广州某某666号—自制数据集", "points": [[91, 151], [278, 151], [278, 192], [91, 192]], "difficult": false, "key_cls": "adress"}, {"transcription": "661546442301175555", "points": [[145, 236], [312, 236], [312, 258], [145, 258]], "difficult": false, "key_cls": "card"}]
ocr/0010.jpg [{"transcription": "计疆怏", "points": [[97, 56], [160, 56], [160, 80], [97, 80]], "difficult": false, "key_cls": "name"}, {"transcription": "性别未知民族汉", "points": [[46, 87], [207, 87], [207, 107], [46, 107]], "difficult": false, "key_cls": "sex"}, {"transcription": "1985年10月17日", "points": [[89, 117], [240, 117], [240, 142], [89, 142]], "difficult": false, "key_cls": "birth"}, {"transcription": "浙江省杭州某某666号—自制数据集", "points": [[91, 151], [278, 151], [278, 195], [91, 195]], "difficult": false, "key_cls": "adress"}, {"transcription": "449688818089602352", "points": [[145, 235], [313, 235], [313, 261], [145, 261]], "difficult": false, "key_cls": "card"}]
ocr/0011.jpg [{"transcription": "萧杀鼬", "points": [[94, 55], [155, 55], [155, 78], [94, 78]], "difficult": false, "key_cls": "name"}, {"transcription": "性别未知民族汉", "points": [[46, 86], [214, 86], [214, 111], [46, 111]], "difficult": false, "key_cls": "sex"}, {"transcription": "1986年9月3日", "points": [[90, 117], [240, 117], [240, 139], [90, 139]], "difficult": false, "key_cls": "birth"}, {"transcription": "福建省福州某某666号——自制数据集", "points": [[93, 149], [278, 149], [278, 192], [93, 192]], "difficult": true, "key_cls": "adress"}, {"transcription": "984445587021926237", "points": [[146, 236], [311, 236], [311, 259], [146, 259]], "difficult": false, "key_cls": "card"}]
ocr/0012.jpg [{"transcription": "乐骶脲", "points": [[95, 55], [158, 55], [158, 79], [95, 79]], "difficult": false, "key_cls": "name"}, {"transcription": "性别机器人民族汉", "points": [[43, 84], [207, 84], [207, 107], [43, 107]], "difficult": false, "key_cls": "sex"}, {"transcription": "2014年6月2日", "points": [[90, 117], [228, 117], [228, 135], [90, 135]], "difficult": false, "key_cls": "birth"}, {"transcription": "安徽省合肥某某666号—自制数据集", "points": [[89, 149], [279, 149], [279, 194], [89, 194]], "difficult": false, "key_cls": "adress"}, {"transcription": "294336746086848019", "points": [[145, 237], [310, 237], [310, 260], [145, 260]], "difficult": false, "key_cls": "card"}]
ocr/0013.jpg [{"transcription": "金桂羔", "points": [[94, 55], [160, 55], [160, 83], [94, 83]], "difficult": false, "key_cls": "name"}, {"transcription": "性别男民族汉", "points": [[45, 87], [207, 87], [207, 108], [45, 108]], "difficult": false, "key_cls": "sex"}, {"transcription": "2019年6月9日", "points": [[89, 117], [237, 117], [237, 140], [89, 140]], "difficult": false, "key_cls": "birth"}, {"transcription": "福建省福州某某666号——自制数据集", "points": [[90, 151], [279, 151], [279, 194], [90, 194]], "difficult": false, "key_cls": "adress"}, {"transcription": "770140527439237475", "points": [[144, 237], [314, 237], [314, 260], [144, 260]], "difficult": false, "key_cls": "card"}]
ocr/0014.jpg [{"transcription": "苗停", "points": [[92, 55], [136, 55], [136, 80], [92, 80]], "difficult": false, "key_cls": "name"}, {"transcription": "性别未知民族汉", "points": [[42, 85], [207, 85], [207, 107], [42, 107]], "difficult": false, "key_cls": "sex"}, {"transcription": "2004年2月25日", "points": [[89, 115], [229, 115], [229, 135], [89, 135]], "difficult": false, "key_cls": "birth"}, {"transcription": "福建省福州某某666号—自制数据集", "points": [[91, 151], [277, 151], [277, 193], [91, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "352678939043916255", "points": [[144, 238], [311, 238], [311, 262], [144, 262]], "difficult": false, "key_cls": "card"}]
ocr/0015.jpg [{"transcription": "安狲", "points": [[93, 54], [141, 54], [141, 78], [93, 78]], "difficult": false, "key_cls": "name"}, {"transcription": "性别机器人民族汉", "points": [[44, 86], [213, 86], [213, 112], [44, 112]], "difficult": false, "key_cls": "sex"}, {"transcription": "1985年9月12日", "points": [[90, 116], [241, 116], [241, 141], [90, 141]], "difficult": false, "key_cls": "birth"}, {"transcription": "山东省济南某某666号——自制数据集", "points": [[90, 151], [279, 151], [279, 193], [90, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "78953491053071603X", "points": [[145, 238], [312, 238], [312, 261], [145, 261]], "difficult": false, "key_cls": "card"}]
ocr/0016.jpg [{"transcription": "成最", "points": [[93, 55], [139, 55], [139, 80], [93, 80]], "difficult": false, "key_cls": "name"}, {"transcription": "性别男民族汉", "points": [[47, 88], [212, 88], [212, 109], [47, 109]], "difficult": false, "key_cls": "sex"}, {"transcription": "2004年10月16日", "points": [[91, 118], [238, 118], [238, 139], [91, 139]], "difficult": false, "key_cls": "birth"}, {"transcription": "潮北省武汉某某666号——自制数据集", "points": [[92, 153], [279, 153], [279, 195], [92, 195]], "difficult": false, "key_cls": "adress"}, {"transcription": "585127964734198547", "points": [[145, 238], [311, 238], [311, 260], [145, 260]], "difficult": false, "key_cls": "card"}]
ocr/0017.jpg [{"transcription": "凤靓", "points": [[94, 56], [141, 56], [141, 80], [94, 80]], "difficult": false, "key_cls": "name"}, {"transcription": "性别未知民族汉", "points": [[47, 88], [211, 88], [211, 109], [47, 109]], "difficult": false, "key_cls": "sex"}, {"transcription": "1995年6月27日", "points": [[91, 117], [245, 117], [245, 142], [91, 142]], "difficult": false, "key_cls": "birth"}, {"transcription": "山西省太原某某666号—自制数据集", "points": [[92, 153], [279, 153], [279, 193], [92, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "925371718911532719", "points": [[145, 238], [311, 238], [311, 260], [145, 260]], "difficult": false, "key_cls": "card"}]
ocr/0018.jpg [{"transcription": "张辉", "points": [[93, 55], [139, 55], [139, 79], [93, 79]], "difficult": false, "key_cls": "name"}, {"transcription": "性别未知民族汉", "points": [[45, 87], [214, 87], [214, 109], [45, 109]], "difficult": false, "key_cls": "sex"}, {"transcription": "2009年12月6日", "points": [[88, 117], [237, 117], [237, 140], [88, 140]], "difficult": false, "key_cls": "birth"}, {"transcription": "潮北省武汉某某666号—自制数据集", "points": [[93, 152], [278, 152], [278, 194], [93, 194]], "difficult": false, "key_cls": "adress"}, {"transcription": "898135533101984568", "points": [[145, 238], [310, 238], [310, 258], [145, 258]], "difficult": false, "key_cls": "card"}]
ocr/0019.jpg [{"transcription": "吕课", "points": [[95, 57], [137, 57], [137, 79], [95, 79]], "difficult": false, "key_cls": "name"}, {"transcription": "性别未知民族汉", "points": [[46, 87], [210, 87], [210, 110], [46, 110]], "difficult": false, "key_cls": "sex"}, {"transcription": "2014年1月16日", "points": [[91, 116], [240, 116], [240, 140], [91, 140]], "difficult": false, "key_cls": "birth"}, {"transcription": "山东省济南某某666号—自制数据集", "points": [[92, 151], [280, 151], [280, 194], [92, 194]], "difficult": false, "key_cls": "adress"}, {"transcription": "418547080384305138", "points": [[146, 235], [310, 235], [310, 260], [146, 260]], "difficult": false, "key_cls": "card"}]
ocr/0020.jpg [{"transcription": "祁峁", "points": [[94, 55], [135, 55], [135, 79], [94, 79]], "difficult": false, "key_cls": "name"}, {"transcription": "性别女民族汉", "points": [[46, 88], [209, 88], [209, 111], [46, 111]], "difficult": false, "key_cls": "sex"}, {"transcription": "2002年3月20日", "points": [[90, 116], [238, 116], [238, 140], [90, 140]], "difficult": false, "key_cls": "birth"}, {"transcription": "河南省郑州某某666号——自制数据集", "points": [[92, 151], [277, 151], [277, 193], [92, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "081155252519463497", "points": [[144, 237], [312, 237], [312, 260], [144, 260]], "difficult": false, "key_cls": "card"}]
ocr/0021.jpg [{"transcription": "魏喜糨", "points": [[96, 55], [154, 55], [154, 77], [96, 77]], "difficult": false, "key_cls": "name"}, {"transcription": "性别男民族汉", "points": [[49, 89], [209, 89], [209, 109], [49, 109]], "difficult": false, "key_cls": "sex"}, {"transcription": "2005年7月15日", "points": [[91, 117], [237, 117], [237, 140], [91, 140]], "difficult": false, "key_cls": "birth"}, {"transcription": "福建省福州某某666号—自制数据集", "points": [[93, 152], [279, 152], [279, 195], [93, 195]], "difficult": false, "key_cls": "adress"}, {"transcription": "885898805473809799", "points": [[145, 238], [310, 238], [310, 257], [145, 257]], "difficult": false, "key_cls": "card"}]
ocr/0022.jpg [{"transcription": "425763470037534625", "points": [[146, 239], [310, 239], [310, 258], [146, 258]], "difficult": false, "key_cls": "card"}, {"transcription": "吉林省长春某某666号——自制数据集", "points": [[93, 153], [280, 153], [280, 194], [93, 194]], "difficult": false, "key_cls": "adress"}, {"transcription": "1986年2月13日", "points": [[93, 117], [237, 117], [237, 138], [93, 138]], "difficult": false, "key_cls": "birth"}, {"transcription": "性别男民族汉", "points": [[47, 88], [207, 88], [207, 109], [47, 109]], "difficult": false, "key_cls": "sex"}, {"transcription": "傅嘁馐", "points": [[95, 57], [155, 57], [155, 78], [95, 78]], "difficult": false, "key_cls": "name"}]
ocr/0023.jpg [{"transcription": "孔操肆", "points": [[97, 57], [155, 57], [155, 80], [97, 80]], "difficult": false, "key_cls": "name"}, {"transcription": "性别男民族汉", "points": [[45, 85], [207, 85], [207, 109], [45, 109]], "difficult": false, "key_cls": "sex"}, {"transcription": "2006年2月21日", "points": [[90, 116], [232, 116], [232, 136], [90, 136]], "difficult": false, "key_cls": "birth"}, {"transcription": "福建省福州号——自制数据集", "points": [[93, 152], [278, 152], [278, 193], [93, 193]], "difficult": false, "key_cls": "adress"}, {"transcription": "760297342529942985", "points": [[146, 239], [309, 239], [309, 257], [146, 257]], "difficult": false, "key_cls": "card"}]

2. Train the det model

python tools/train.py -c configs/det/ch_PP-OCRv4/ch_PP-OCRv4_det_cml.yml

Changes to the config: cal_metric_during_train set to false, Train.loader.batch_size_per_card set to 10, and Global.pretrained_model pointed at a local model.

Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 20
  save_model_dir: ./output/ch_PP-OCRv4_det
  save_epoch_step: 50
  eval_batch_step:
  - 0
  - 1000
  cal_metric_during_train: false
  checkpoints: null
  pretrained_model: ./train/model/ch_PP-OCRv4_det_train/best_accuracy.pdparams
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./checkpoints/det_db/predicts_db.txt
  distributed: true
Architecture:
  name: DistillationModel
  algorithm: Distillation
  model_type: det
  Models:
    Student:
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: PPLCNetV3
        scale: 0.75
        pretrained: false
        det: true
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: true
      Head:
        name: DBHead
        k: 50
    Student2:
      pretrained: null
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: PPLCNetV3
        scale: 0.75
        pretrained: true
        det: true
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: true
      Head:
        name: DBHead
        k: 50
    Teacher:
      pretrained: https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_cml_teacher_pretrained/teacher.pdparams
      freeze_params: true
      return_all_feats: false
      model_type: det
      algorithm: DB
      Backbone:
        name: ResNet_vd
        in_channels: 3
        layers: 50
      Neck:
        name: LKPAN
        out_channels: 256
      Head:
        name: DBHead
        kernel_list:
        - 7
        - 2
        - 2
        k: 50
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDilaDBLoss:
      weight: 1.0
      model_name_pairs:
      - - Student
        - Teacher
      - - Student2
        - Teacher
      key: maps
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
  - DistillationDMLLoss:
      model_name_pairs:
      - Student
      - Student2
      maps_name: thrink_maps
      weight: 1.0
      key: maps
  - DistillationDBLoss:
      weight: 1.0
      model_name_list:
      - Student
      - Student2
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 5.0e-05
PostProcess:
  name: DistillationDBPostProcess
  model_name:
  - Student
  key: head_out
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.5
Metric:
  name: DistillationMetric
  base_metric_name: DetMetric
  main_indicator: hmean
  key: Student
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    label_file_list:
    - ./train_data/det/train.txt
    ratio_list: [1.0]
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 640
        - 640
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
        total_epoch: 500
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
        total_epoch: 500
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    # batch_size_per_card: 16
    batch_size_per_card: 10
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/
    label_file_list:
    - ./train_data/det/val.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest:
        limit_side_len: 960
        limit_type: max
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1
    num_workers: 2
profiler_options: null

3. Train the rec model

python tools/train.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec.yml

Change to the config: Global.pretrained_model pointed at a local model.

Global:
  debug: false
  use_gpu: true
  epoch_num: 200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/ch_PP-OCRv4_rec
  save_epoch_step: 10
  eval_batch_step: [0, 2000]
  cal_metric_during_train: true
  pretrained_model: ./train/model/ch_PP-OCRv4_rec_train/student.pdparams
  checkpoints:
  save_inference_dir:
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  max_text_length: &max_text_length 25
  infer_mode: false
  use_space_char: true
  distributed: true
  save_res_path: ./output/rec/predicts_ppocrv3.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform:
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
    - CTCHead:
        Neck:
          name: svtr
          dims: 120
          depth: 2
          hidden_dims: 120
          kernel_size: [1, 3]
          use_guide: True
        Head:
          fc_decay: 0.00001
    - NRTRHead:
        nrtr_dim: 384
        max_text_length: *max_text_length
Loss:
  name: MultiLoss
  loss_config_list:
  - CTCLoss:
  - NRTRLoss:
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: ./train_data/
    ext_op_transform_idx: 1
    label_file_list:
    - ./train_data/rec/train.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - RecConAug:
        prob: 0.5
        ext_data_num: 2
        image_shape: [48, 320, 3]
        max_text_length: *max_text_length
    - RecAug:
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  sampler:
    name: MultiScaleSampler
    scales: [[320, 32], [320, 48], [320, 64]]
    first_bs: &bs 192
    fix_bs: false
    divided_factor: [8, 16] # w, h
    is_training: True
  loader:
    shuffle: true
    batch_size_per_card: *bs
    drop_last: true
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data
    label_file_list:
    - ./train_data/rec/val.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - MultiLabelEncode:
        gtc_encode: NRTRLabelEncode
    - RecResizeImg:
        image_shape: [3, 48, 320]
    - KeepKeys:
        keep_keys:
        - image
        - label_ctc
        - label_gtc
        - length
        - valid_ratio
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4

4. Export the inference models

python tools/export_model.py -c configs/det/ch_PP-OCRv4/ch_PP-OCRv4_det_cml.yml -o Global.pretrained_model=output/ch_PP-OCRv4_det/best_model/model.pdparams Global.save_inference_dir=inference_model/det/

python tools/export_model.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec.yml -o Global.pretrained_model="./output/ch_PP-OCRv4_rec/latest.pdparams" Global.save_inference_dir="./inference_model/rec/"

5. Run the script

import cv2
from paddleocr import PaddleOCR

# Load the exported det and rec inference models
paddleocr = PaddleOCR(lang='ch', show_log=False, enable_mkldnn=True,
                      det_model_dir='./inference_model/det/Teacher',
                      rec_model_dir='./inference_model/rec'
                      )
img = cv2.imread('./train_data/0000.jpg')
result = paddleocr.ocr(img)
# Print the recognized text of every detected box in the first image
for i in range(len(result[0])):
    print(result[0][i][1][0])
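One detail in the script above may be worth checking: `det_model_dir` points at the `Teacher` sub-directory, while the det config marks the Teacher with `freeze_params: true` and evaluates with `key: Student`. If the export also wrote a `Student` folder next to `Teacher` (an assumption about the export layout, not confirmed in this thread), loading that folder would exercise the weights that were actually fine-tuned. A minimal sketch, where `pick_det_model_dir` is a hypothetical helper and the folder names are taken from the `Architecture.Models` keys in the config:

```python
import os


def pick_det_model_dir(export_root, preferred=("Student", "Student2", "Teacher")):
    """Return the first existing sub-model folder under export_root.

    Assumption: the distillation export created one folder per sub-model,
    named after the keys under Architecture.Models in the det config.
    Falls back to export_root itself when none of those folders exist.
    """
    for name in preferred:
        candidate = os.path.join(export_root, name)
        if os.path.isdir(candidate):
            return candidate
    return export_root
```

The result could then be passed as `det_model_dir=pick_det_model_dir('./inference_model/det')` in place of the hard-coded `Teacher` path, which makes it easy to compare Student and Teacher output on the same image.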
The dataset is too small.
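To put a number on that, the label file can be counted directly. The sketch below parses lines in the `Label.txt` layout shown in the question (an image path followed by a JSON list of boxes) and estimates how many steps one epoch yields for a given batch size. `summarize_labels` and `steps_per_epoch` are hypothetical helpers, and the step estimate assumes a single card and a plain fixed-size batch sampler, which may not match the multi-scale sampler used in the rec config.

```python
import json
import math
from collections import Counter


def summarize_labels(lines):
    """Count images, boxes, and boxes per key_cls in Label.txt-style lines."""
    n_images, n_boxes, per_class = 0, 0, Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The image path precedes the first '['; the JSON list starts there.
        _path, _, payload = line.partition("[")
        boxes = json.loads("[" + payload)
        n_images += 1
        n_boxes += len(boxes)
        per_class.update(box.get("key_cls", "unknown") for box in boxes)
    return n_images, n_boxes, dict(per_class)


def steps_per_epoch(n_samples, batch_size, drop_last=False):
    """Batches per epoch for a fixed batch size (simplified, single card)."""
    if drop_last:
        return n_samples // batch_size
    return math.ceil(n_samples / batch_size)
```

Under this simplified model, `drop_last: true` with a batch size of 192 gives zero full batches whenever there are fewer than 192 samples, so comparing the number of crops in `./train_data/rec/train.txt` with the configured batch size is a quick sanity check.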
Please provide the following information to quickly locate the problem
One moment, I'll attach the training process.