

Breast cancer (BC) is one of the most serious health concerns for adult women worldwide. Early detection and accurate risk prediction are vital for providing optimal care and improving patient outcomes, and survival rates can rise when the disease is discovered early. In the past few years, promising approaches based on large-scale data merging and ensemble learning have appeared for classifying and predicting BC risk. Machine learning (ML) methods are crucial in medical applications because they offer an effective way to classify data and support early diagnosis. This study classifies BC cases using the XGBoost algorithm and the 10 predictors in the Breast Cancer Coimbra Dataset (BCCD), with early detection of BC as the primary goal. In this research, the XGBoost classifier achieved 98.32% accuracy, 98% precision, 97% recall, and a 99% F1-score.
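
As a rough illustration of the terms used above, the following minimal Python sketch shows how Z-score standardization (listed among the keywords) and the four reported measures are commonly computed for a binary classifier. It is not the author's code: the function names `z_score` and `classification_metrics` and the toy labels are assumptions introduced here, and no BCCD data or XGBoost model is involved.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize a feature column to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels for illustration only (1 = patient, 0 = control).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
    print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))
```

In practice, the standardization statistics would be estimated on the training split only and then applied to the test split, so that no test information leaks into the model.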


Keywords: Machine Learning, XGBoost, Z-score, BCCD

Article Details

How to Cite
A. S. Jaddoa, “Early Breast Cancer Detection in Coimbra Dataset Using Supervised Machine Learning (XGBoost)”, bit-cs, vol. 5, no. 2, pp. 85-89, Jun. 2024.


