Abstract
This paper addresses the problem of extracting view-invariant visual features for the recognition of object-directed actions and introduces a computational model of how these visual features are processed in the brain. In particular, in the test-bed setting of reach-to-grasp actions, grip aperture is identified as a good candidate for inclusion in a parsimonious set of high-level hand features describing overall hand movement during reach-to-grasp actions. The computational model NeGOI (neural network architecture for measuring grip aperture in an observer-independent way), which extracts grip aperture in a view-independent fashion, was developed on the basis of functional hypotheses about cortical areas involved in visual processing. An assumption built into NeGOI is that grip aperture can be measured from the superposition of a small number of prototypical hand shapes corresponding to predefined grip-aperture sizes. The key idea underlying the NeGOI model is to introduce view-independent units (VIP units) that are selective for prototypical hand shapes, and to integrate the output of VIP units in order to compute grip aperture. The distinguishing traits of the NeGOI architecture are discussed together with test results concerning its view-independence and grip-aperture recognition properties. The overall functional organization of the NeGOI model is shown to be consistent with current functional models of the ventral visual stream, up to and including temporal area STS. Finally, the functional role of the NeGOI model is examined from the perspective of a biologically plausible architecture that provides a parsimonious set of high-level, view-independent visual features as input to mirror systems.
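The abstract's key idea, reading out grip aperture by integrating the responses of view-independent units tuned to prototypical hand shapes, can be illustrated with a minimal sketch. This is not the paper's implementation: the prototype aperture values, the Gaussian tuning of each VIP unit, and the weighted-average readout are all illustrative assumptions introduced here to make the population-coding idea concrete.

```python
import numpy as np

# Illustrative sketch (not the paper's model): each VIP unit is assumed to be
# tuned to one prototypical grip-aperture size, and the observed aperture is
# recovered as a normalized weighted sum of the prototypes' aperture values.

PROTOTYPE_APERTURES = np.array([2.0, 5.0, 8.0, 11.0])  # cm; assumed values


def vip_responses(observed_aperture, sigma=2.0):
    """Assumed Gaussian tuning: each unit responds most strongly when the
    observed aperture matches its preferred prototype aperture."""
    return np.exp(-((observed_aperture - PROTOTYPE_APERTURES) ** 2)
                  / (2.0 * sigma ** 2))


def estimate_grip_aperture(responses):
    """Integrate VIP-unit outputs: weighted average of prototype apertures,
    with weights given by the (normalized) unit responses."""
    return float(responses @ PROTOTYPE_APERTURES / responses.sum())


if __name__ == "__main__":
    r = vip_responses(6.0)
    print(f"estimated aperture: {estimate_grip_aperture(r):.2f} cm")
```

Under this population-code reading, a hand shape lying between two prototypes activates both of the corresponding units, and the weighted readout interpolates between their preferred apertures.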