Abstract

Over the past ten years, asynchronous design has re-emerged as a viable alternative to clocked design, with mounting evidence of competitive performance, power efficiency and electromagnetic compatibility compared to the more mainstream synchronous style. This paper presents an asynchronous cache architecture, the logical choice for use with an asynchronous microprocessor, where a synchronous cache would compromise the asynchronous benefits of the processor. The design presented here provides the processor with a unified, dual-ported view of its memory subsystem using multiple interleaved blocks. Each block has separate instruction and data line-buffers that act as a level-zero (L0) cache, which makes cache access time highly variable. The cache employs a copy-back write strategy to support a high-performance embedded processor core. The design is optimised for the Amulet3 microprocessor core, the third fully asynchronous implementation of the ARM architecture, but the techniques employed are generally applicable to any asynchronous processor.