Abstract: Recognising continuous emotions and action unit (AU) intensities from face videos, requires a spatial and temporal understanding of expression dynamics. Existing works primarily rely on 2D ...