Research on Intelligent Generation Algorithm of Interface Icon Based on Diffusion Model

Published: 2026-03-21 19:57:01


Authors

Liu, L.

DOI: 

https://doi.org/10.71451/ISTAER2607

Keywords: 

Diffusion model; Interface icon generation; Multimodal conditional control; Structure perception; Style consistency

Abstract

Interface icon generation suffers from weak structural expression, difficulty in maintaining style consistency, and limited support for multi-condition generation. To address these problems, this paper proposes IconDiff, a structure-aware icon generation method based on a diffusion model. Building on the classical diffusion framework, the method introduces a structure-guided branch and a multimodal condition fusion mechanism to jointly model text semantics, style features, and attribute information, and it uses an icon-specific loss function to improve boundary clarity and semantic identifiability. A multidimensionally annotated dataset of 268,000 icon samples is constructed, together with an evaluation metric system designed specifically for icon tasks. Under a unified experimental setup, compared with several mainstream generation methods, the proposed method reduces FID by approximately 25.2%, improves structural clarity by about 6.0%, improves identifiability by about 6.8%, and improves style consistency by about 7.8%. Ablation experiments verify the effectiveness of the key modules, and generalization and robustness analyses show that the model maintains stable performance even when semantic and style conditions are absent. These results indicate that the method significantly improves generation quality and controllability and offers an effective solution for the automatic design of interface icons.
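
For readers unfamiliar with the "classical diffusion framework" the abstract builds on, the following is a minimal sketch of the standard forward noising step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. It is not IconDiff itself: the structure-guided branch, the condition fusion mechanism, and the icon-specific loss are not specified in the abstract and are not reproduced here. The function names, the linear schedule, and the values 1e-4, 0.02, and 1000 steps are illustrative assumptions.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) for a flat list of pixel values x0.

    Returns the noised pixels and the noise that was added; a denoising
    network would typically be trained to predict that noise from x_t and t.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * p + sigma * n for p, n in zip(x0, noise)]
    return xt, noise


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    icon = [1.0, -1.0, 1.0, -1.0]  # hypothetical 2x2 "icon" scaled to [-1, 1]
    for t in (0, 499, 999):
        xt, _ = add_noise(icon, t, alpha_bars, rng)
        print(t, alpha_bars[t], xt)
```

At small t the sample stays close to the clean icon, and at large t it is almost pure Gaussian noise. This is why any conditioning signal (text, style, attributes, or structure) has to steer the reverse, denoising direction.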

Published

2026-03-21

Data Availability Statement

The data that support the findings of this study are available upon request from the corresponding author, L.L.

Issue

Vol. 4 No. 1 (March 2026)

Section

Research Article

License

Copyright (c) 2026 International Scientific Technical and Economic Research

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

How to Cite

Liu, L. (2026). Research on Intelligent Generation Algorithm of Interface Icon Based on Diffusion Model. International Scientific Technical and Economic Research, 4(1), 149-167. https://doi.org/10.71451/ISTAER2607

Information

International Scientific Technical and Economic Research is a journal of Sichuan Knowledgeable Intelligent Sciences.

ISSN:2959-1309

editorial@istaer.online

International Scientific Technical and Economic Research is licensed under a Creative Commons Attribution 4.0 International License.

