We report here that monkeys can actively match the number of sounds they hear to the number of shapes they see, and we present the first evidence that monkeys sum over sounds and sights. In Experiment 1, two monkeys were trained to choose a simultaneous array of 1–9 squares that numerically matched a sample sequence of shapes or sounds. Monkeys numerically matched across (audio–visual) and within (visual–visual) modalities with equal accuracy and transferred to novel numerical values. In Experiment 2, monkeys presented with sample sequences of randomly ordered shapes and tones were able to choose an array of 2–9 squares that was the numerical sum of the shapes and sounds in the sample sequence. In both experiments, accuracy and reaction time depended on the ratio between the correct numerical match and the incorrect choice. These findings suggest that monkeys and humans share an abstract numerical code that can be divorced from the modality in which stimuli are first experienced.