Background: A strong correlation between breast cancer (BC) molecular subtypes and axillary status has been shown. It would be useful to predict the probability of lymph node (LN) positivity. Objective: To develop multivariable models for predicting LN metastases and to assess their performance, including nomograms derived from logistic regression on clinical and pathologic variables obtained either from tumor surgical results or from biopsy alone.

Methods: A retrospective cohort was randomly divided into two separate patient sets: a training set and a validation set. In the training set, we used multivariable logistic regression to build different predictive nomograms for the risk of LN metastases. The discrimination and calibration of the resulting nomograms were evaluated on both the training and validation sets.

Results: The cohort was a consecutive sample of 12,572 patients with early BC who underwent sentinel node biopsy and received no neoadjuvant therapy. In the model predicting LN macrometastases, the area under the curve (AUC) was 0.780 for the pathologic model and 0.717 for the pre-operative model, with good calibration. Results on the validation set were similar, with AUCs of 0.796 and 0.725, respectively. Among the candidate regression variables, age, tumor size, lymphovascular invasion (LVI), and molecular subtype were identified on the training set as statistically significant predictors of LN metastases.

Conclusions: Several nomograms have been reported to predict the risk of sentinel lymph node (SLN) and non-sentinel node (NSN) involvement. We propose a new model for estimating the risk of positive LN, with similar performance. It could help in choosing management strategies: avoiding axillary LN staging, proposing axillary lymph node dissection (ALND) for patients with a high probability of major axillary LN involvement, and proposing immediate breast reconstruction for patients without LN macrometastasis, when post-mastectomy radiotherapy is not required.
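
As an illustration of the two quantities the abstract relies on, the following minimal Python sketch shows how a logistic regression model turns patient variables into a predicted probability, and how discrimination (AUC) can be computed from predicted probabilities and observed outcomes. It is not the authors' model: the intercept, the coefficients, and the toy patients are hypothetical values with no clinical meaning, molecular subtype is omitted for brevity, and the function names are our own.

```python
import math


def predicted_probability(intercept, coefficients, features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def calibration_in_the_large(scores, labels):
    """Mean predicted risk minus observed event rate (0 means no overall bias)."""
    return sum(scores) / len(scores) - sum(labels) / len(labels)


# Hypothetical coefficients, NOT taken from the study:
# age (years), tumor size (mm), LVI (0 = absent, 1 = present).
HYPOTHETICAL_INTERCEPT = -2.0
HYPOTHETICAL_COEFFICIENTS = [-0.02, 0.05, 1.0]

# Toy patients: (features, observed LN status). Invented for illustration only.
toy_patients = [
    ([65, 8, 0], 0),
    ([50, 15, 0], 0),
    ([45, 25, 1], 1),
    ([70, 12, 0], 0),
    ([40, 30, 1], 1),
    ([55, 22, 0], 1),
]

if __name__ == "__main__":
    risks = [
        predicted_probability(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFFICIENTS, x)
        for x, _ in toy_patients
    ]
    outcomes = [y for _, y in toy_patients]
    print("predicted risks:", [round(r, 3) for r in risks])
    print("AUC:", auc(risks, outcomes))
    print("calibration-in-the-large:", round(calibration_in_the_large(risks, outcomes), 3))
```

In a study of this design, the coefficients would be estimated on the training set only, and the same AUC and calibration checks would then be repeated unchanged on the validation set.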