Input variable selection that combines data-driven and statistical methods has received increasing attention for analysing the probability of occurrence of freshwater organisms. Eight sampling sites (from the source to the mouth of the Gamasiab River basin, Iran) were surveyed to study the occurrence of Alburnoides mossulensis over a one-year study period (2008-2009). A set of river characteristics, together with the abundance of the target fish (recorded as presence/absence data), was measured at each sampling site. Logistic regression was optimized with an input variable selection method, a greedy stepwise search algorithm, to select the most important explanatory variables for analysing the occurrence of the fish. According to the optimization method, almost one-third of the variables recorded at the sampling sites, namely electrical conductivity (EC), bicarbonate (HCO3-), river width, river depth, water temperature, pH, sulphate (SO4(2-)) and orthophosphate (PO4(3-)-P), might influence the probability of occurrence of the fish in the river, whereas the binary logistic regression model identified electrical conductivity and bicarbonate as the most important variables (p < 0.05 for both).
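
As an illustration of how a greedy stepwise search can be wrapped around a binary logistic regression, the following is a minimal sketch and not the implementation used in the study. Several choices are assumptions made for the example: the search runs forward only, candidate variables are scored with the Akaike information criterion (AIC), the model is fitted by plain gradient ascent, and the data are synthetic. The function names (`fitted_log_likelihood`, `aic`, `greedy_forward_selection`, `make_demo_data`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fitted_log_likelihood(rows, y, cols, steps=300, lr=0.5):
    """Fit intercept + the chosen columns by gradient ascent; return the log-likelihood."""
    design = [[1.0] + [r[c] for c in cols] for r in rows]
    w = [0.0] * (len(cols) + 1)
    n = len(design)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, t in zip(design, y):
            err = t - sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    ll = 0.0
    for x, t in zip(design, y):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return ll


def aic(rows, y, cols):
    """AIC = 2k - 2 log L, where k counts the intercept (assumed selection criterion)."""
    return 2 * (len(cols) + 1) - 2 * fitted_log_likelihood(rows, y, cols)


def greedy_forward_selection(rows, y, n_features):
    """Repeatedly add the single variable that lowers AIC most; stop when none does."""
    selected = []
    best = aic(rows, y, selected)  # intercept-only baseline
    while len(selected) < n_features:
        remaining = [c for c in range(n_features) if c not in selected]
        score, choice = min((aic(rows, y, selected + [c]), c) for c in remaining)
        if score >= best:
            break
        selected.append(choice)
        best = score
    return selected


def make_demo_data(n=60, n_features=4, seed=0):
    """Synthetic presence/absence data: only column 0 drives the response."""
    rng = random.Random(seed)
    rows = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)] for _ in range(n)]
    y = [1 if r[0] > 0 else 0 for r in rows]
    return rows, y


rows, y = make_demo_data()
selected = greedy_forward_selection(rows, y, n_features=4)
print("selected column indices:", selected)
```

In practice the columns would hold standardized river characteristics such as EC or bicarbonate, and the response would be the presence (1) or absence (0) of the fish at each sampling event. The stopping rule and the scoring criterion strongly affect which variables survive, so both should be reported alongside the selected set.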