BACKGROUND AND OBJECTIVE: Serum protein electrophoresis (SPEP) plays a critical role in diagnosing diseases associated with M-proteins. However, its clinical application is limited by a heavy reliance on experienced experts.

METHODS: A dataset comprising 85,026 SPEP outcomes was utilized to develop artificial intelligence diagnostic models for the classification and localization of M-proteins. These models were trained and validated using three data features, and their performance was evaluated using comprehensive metrics, including sensitivity, positive predictive value (PPV), specificity, negative predictive value (NPV), F1 score, accuracy, area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), and Intersection over Union (IoU). The best-performing machine learning (ML) and deep learning (DL) models were further tested on a separate dataset of 1,079 samples. The localization ability of the DL model was compared against three clinical experts.

RESULTS: Among the four ML models, the extreme gradient boosting (XGB) model achieved the best performance, with MCC, AUC, F1 score, sensitivity, specificity, accuracy, PPV, and NPV of 0.847, 0.903, 0.875, 0.822, 0.985, 0.951, 0.934, and 0.955, respectively. Different feature extraction methods significantly influenced model performance. The DL models outperformed the ML models in comprehensive performance. The U-Net combined with Transformer model demonstrated localization ability comparable to that of clinical experts, achieving sensitivity, specificity, accuracy, PPV, NPV, F1 score, AUC, MCC, and IoU of 0.947, 0.984, 0.976, 0.938, 0.986, 0.942, 0.966, 0.927, and 0.877, respectively.

CONCLUSION: The U-Net combined with the Transformer model demonstrated expert-level performance in M-protein classification and localization, achieving an accuracy of 0.976 and an IoU of 0.877. This exceptional performance highlights the potential of this combined model for automating clinical SPEP workflows.
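
As an illustration of the evaluation metrics named in the abstract, the following minimal Python sketch computes sensitivity, specificity, PPV, NPV, accuracy, F1 score, and MCC from binary labels, and IoU from two binary masks. It is not the authors' code. The function names, the toy labels, and the choice to represent localization as a per-point binary mask along the electrophoresis trace are assumptions made for illustration only; the abstract does not state how IoU was computed. AUC is omitted because it requires continuous model scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = M-protein present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(num, den):
    # Assumption: return 0.0 when a denominator is zero (conventions vary).
    return num / den if den else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Standard definitions of the classification metrics listed above."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }


def iou(mask_true, mask_pred):
    """Intersection over Union of two equal-length binary masks."""
    inter = sum(1 for t, p in zip(mask_true, mask_pred) if t and p)
    union = sum(1 for t, p in zip(mask_true, mask_pred) if t or p)
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


if __name__ == "__main__":
    # Toy labels, not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    print(classification_metrics(*confusion_counts(y_true, y_pred)))

    # Toy masks marking a hypothetical M-protein peak region on a trace.
    print(iou([0, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 0]))
```

MCC is reported alongside accuracy because it accounts for all four confusion-matrix cells and is therefore less flattering than accuracy when positive and negative samples are imbalanced.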