Comparative Analysis of Accuracy between ChatGPT and Blackbox AI in Detecting Mockup Images

Authors

  • Devi Fajar Wati Universitas Horizon Indonesia
  • Lila Setiyani Horizon University Indonesia
  • Deden Moh Alfiansyah Horizon University Indonesia
  • Muhammad Jembar Jomantara Horizon University Indonesia
  • Dedih Universitas Horizon Indonesia

DOI:

https://doi.org/10.36805/j1574z11

Keywords:

Artificial Intelligence, Mockup Image Detection, ChatGPT, Blackbox AI, Visual Analysis

Abstract

Advances in artificial intelligence (AI) for image recognition have opened new opportunities for understanding and analyzing visual elements, including the mockup images commonly used in UI/UX design and branding. This study compares the accuracy of two AI systems, ChatGPT with its multimodal capabilities and the proprietary Blackbox AI, in detecting and describing mockup images. Using 150 mockup images from several categories, such as mobile applications, websites, and promotional materials, we evaluated both systems on object detection accuracy, contextual understanding, description quality, and consistency of results. The results show that ChatGPT significantly outperforms Blackbox AI in understanding visual context and producing relevant descriptions. These findings underscore the importance of semantic reasoning capabilities in visual analysis tasks and contribute to the development of more adaptive, context-aware AI for digital design.
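The per-criterion comparison described above can be sketched as a simple score aggregation. This is a minimal illustration only: the record structure, criterion names, and score values below are hypothetical placeholders, not the study's actual data or evaluation code.

```python
from statistics import mean

# Hypothetical evaluation records, one per mockup image, with scores in [0, 1]
# for each criterion named in the abstract. All values are illustrative.
results = [
    {"system": "ChatGPT", "category": "mobile app",
     "object_detection": 0.92, "context_understanding": 0.88,
     "description_quality": 0.90},
    {"system": "ChatGPT", "category": "website",
     "object_detection": 0.89, "context_understanding": 0.85,
     "description_quality": 0.87},
    {"system": "Blackbox AI", "category": "mobile app",
     "object_detection": 0.81, "context_understanding": 0.70,
     "description_quality": 0.72},
    {"system": "Blackbox AI", "category": "website",
     "object_detection": 0.78, "context_understanding": 0.66,
     "description_quality": 0.69},
]

CRITERIA = ["object_detection", "context_understanding", "description_quality"]

def mean_scores(records, system):
    """Average each criterion over all records belonging to one system."""
    rows = [r for r in records if r["system"] == system]
    return {c: mean(r[c] for r in rows) for c in CRITERIA}

# Aggregate and compare the two systems criterion by criterion.
chatgpt = mean_scores(results, "ChatGPT")
blackbox = mean_scores(results, "Blackbox AI")
for c in CRITERIA:
    print(f"{c}: ChatGPT={chatgpt[c]:.3f} vs Blackbox AI={blackbox[c]:.3f}")
```

In the full study this aggregation would run over all 150 images and include the consistency criterion; the sketch only shows the shape of such a comparison.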



Published

2025-10-30