The perception of non-native speech is influenced by prior attunement to the native language. Evidence from auditory-only (AO) citation speech research indicates that non-native consonants are perceptually assimilated to native language categories, which often makes non-native speech contrasts difficult to discriminate. However, because auditory-visual (AV) speech and clear speech have been shown to benefit non-native speech perception, we reasoned that modality and speaking style may also help to clarify categorical distinctions and goodness-of-fit differences between contrasting non-native consonants. We tested this by comparing the perceptual assimilation of Sindhi consonants by Australian English monolinguals across AO, AV, and visual-only (VO) conditions, in both clear and citation speech. Although the consonants were assimilated to native categories similarly across AO and AV conditions in both speaking styles, categorization consistency depended on whether the consonants were categorized in the VO condition. Consonants that were uncategorized in VO were more consistently categorized in AO than in AV, and in citation speech than in clear speech. These patterns were reversed for consonants that were categorized in VO: participants used the visual articulatory information to make more consistent categorization judgments in AV than in AO, and clear speech was more consistently categorized than citation speech. These results suggest that the impact of AV and clear speech on cross-language perception depends on the similarities and differences between the articulatory characteristics of native and non-native consonants. They may have important implications for second language learning, as well as for human-machine interaction, via the robustness of automatic speech recognition systems.