Various methods have been proposed to characterize the functional connectivity between nodes in a network measured with different modalities (e.g., electrophysiology, functional magnetic resonance imaging). Since different measures of functional connectivity yield different results for the same dataset, it is important to assess when and how they can be used. In this work, we provide a systematic framework for evaluating the performance of a large range of functional connectivity measures, based upon a comprehensive portfolio of models generating measurable responses. Specifically, we benchmarked 42 methods using 10,000 simulated datasets from 5 different types of generative models with different connectivity structures. Since all functional connectivity methods require some parameters to be set (e.g., window size and number, model order), we first optimized these parameters using performance criteria based upon threshold-free ROC analysis. We then evaluated the performance of the methods on data simulated with the different types of models. Finally, we assessed the performance of the methods across different signal-to-noise ratios and network configurations. A MATLAB toolbox is provided so that such analyses can be performed with other methods and simulated datasets.
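
To make the threshold-free ROC criterion concrete, the following minimal Python sketch scores a single connectivity measure against a known ground truth. It is an illustration under stated assumptions, not the MATLAB toolbox or any of the 42 benchmarked methods: the generative model (a first-order autoregressive chain), the measure (absolute Pearson correlation), and all function names and parameter values (`simulate_chain`, `abs_pearson`, `roc_auc`, `benchmark_correlation`, the coefficients of 0.5) are hypothetical choices made for this example. The area under the ROC curve is computed as the fraction of (true connection, absent connection) pairs that the measure ranks correctly, so no detection threshold has to be chosen.

```python
import math
import random
from itertools import combinations


def simulate_chain(n_nodes=5, n_samples=5000, self_coef=0.5, coupling=0.5, seed=0):
    """Toy generative model (assumed for illustration): a first-order
    autoregressive chain in which node i-1 drives node i with unit-variance noise."""
    rng = random.Random(seed)
    prev = [0.0] * n_nodes
    series = [[] for _ in range(n_nodes)]
    for _ in range(n_samples):
        cur = []
        for i in range(n_nodes):
            drive = coupling * prev[i - 1] if i > 0 else 0.0
            cur.append(self_coef * prev[i] + drive + rng.gauss(0.0, 1.0))
        for i, value in enumerate(cur):
            series[i].append(value)
        prev = cur
    return series


def abs_pearson(x, y):
    """Absolute zero-lag Pearson correlation, used here as the connectivity measure."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / math.sqrt(sxx * syy)


def roc_auc(scores, labels):
    """Threshold-free area under the ROC curve: the fraction of
    (positive, negative) pairs ranked correctly, counting ties as one half."""
    pos = [s for s, is_edge in zip(scores, labels) if is_edge]
    neg = [s for s, is_edge in zip(scores, labels) if not is_edge]
    if not pos or not neg:
        raise ValueError("need at least one true and one absent connection")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def benchmark_correlation(n_nodes=5, seed=0):
    """Score every node pair and compare with the known (undirected) chain structure."""
    series = simulate_chain(n_nodes=n_nodes, seed=seed)
    pairs = list(combinations(range(n_nodes), 2))
    scores = [abs_pearson(series[i], series[j]) for i, j in pairs]
    labels = [j - i == 1 for i, j in pairs]  # only neighbouring nodes are connected
    return roc_auc(scores, labels)


if __name__ == "__main__":
    print(f"AUC of absolute correlation on the toy chain: {benchmark_correlation():.3f}")
```

In the same spirit, a free parameter of a measure (for example a window size or a model order) could be tuned by repeating such a run over a grid of candidate values and keeping the value with the highest area under the curve, averaged over many simulated datasets.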