Going forward, the CSS Speech Module seems like a better place for auditory tonal indicators. Ordinary CSS, which we’ve already had for years, is a better place for visual presentation.
This leaves only a minuscule semantic difference between <i> and <em>, or <b> and <strong>, as outlined in the HTML Living Standard. I don’t think that difference warrants extra elements in the HTML standard: the extra elements likely create more confusion than actual benefit. Over the past decade, I’m unaware of any user-agents treating them differently enough, in a way that aligns with author intent, to matter.
I personally just avoid <i> and <b> when authoring. The complexity is more trouble than it’s worth.
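As a rough sketch of what authoring without <i> and <b> could look like, the stylesheet below keeps visual styling in ordinary CSS and puts spoken emphasis in CSS Speech Module properties. The class name .vessel-name is hypothetical, chosen only for illustration, and the speech rules assume a user agent that implements the module, which few if any currently do.

```css
/* Visual presentation: ordinary CSS, applied to the semantic elements. */
em { font-style: italic; }
strong { font-weight: bold; }

/* Hypothetical class for text that is italic by convention only
   (for example a ship name), where <i> might otherwise be used. */
.vessel-name { font-style: italic; }

/* Auditory presentation: properties from the CSS Speech Module.
   Assumption: the user agent supports them; most ignore these rules. */
em { voice-stress: moderate; }
strong { voice-stress: strong; voice-volume: loud; }
```

With rules like these, something such as <span class="vessel-name"> would carry the purely visual italics, while <em> and <strong> remain the only emphasis elements in the markup.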