Segmental speech units (e.g. phonemes) are described as multidimensional categories wherein perception involves contributions from multiple acoustic input dimensions, and the relative perceptual weights of these dimensions respond dynamically to context. Can prosodic aspects of speech spanning multiple phonemes, syllables or words be characterized similarly? Here we investigated the relative contribution of two acoustic dimensions to word emphasis. Participants categorized instances of a two-word phrase pronounced with typical covariation of fundamental frequency (F0) and duration, and in the context of an artificial accent in which F0 and duration covaried atypically. When categorizing accented speech, listeners rapidly down-weighted the secondary dimension (duration) while continuing to rely on the primary dimension (F0). These results address two core theoretical questions, showing that 1) prosodic categories are signalled by multiple acoustic input dimensions and 2) perceptual cue weights for prosodic categories dynamically adapt to local regularities of speech input.

Highlights

- Prosodic categories are signalled by multiple acoustic dimensions.
- The influence of these dimensions flexibly adapts to changes in local speech input.
- This adaptive plasticity may help tune perception to atypical accented speech.
- Similar learning models may account for segmental and suprasegmental flexibility.