My understanding is that the original SMTP protocol was defined to limit transmission to 7-bit characters in order to save on transmission costs.
The protocol is almost 40 years old, and multiple RFCs have since extended the standard.
For compatibility reasons, many if not most modern servers, even when they are 8-bit clean, convert the message into a "7-bit compatible" format such as quoted-printable or base64.
So technically, all the characters are 7-bit ASCII.
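To make that conversion concrete, here is a minimal Python sketch using the standard library's quopri and base64 modules; the sample string is only an illustration, not anything mandated by the protocol:

    import base64
    import quopri

    body = "Héllo, wörld!".encode("utf-8")   # raw 8-bit message body

    # Quoted-printable: text stays mostly readable, non-ASCII bytes become =XX escapes.
    print(quopri.encodestring(body).decode("ascii"))   # H=C3=A9llo, w=C3=B6rld!

    # Base64: opaque, but efficient for content that is mostly non-ASCII or binary.
    print(base64.b64encode(body).decode("ascii"))      # SMOpbGxvLCB3w7ZybGQh

    # Either way, every byte of the encoded result falls within the 7-bit ASCII range.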
However, the crux of my question is: even if the data is encoded in a 7-bit friendly way, does the physical transmission of bits between SMTP servers occur in 7-bit units, or does it happen in 8-bit units?
My assumption is that it happens in 8-bit units, even if the data is encoded as ASCII. Is this correct?
Here are some relevant links I found:
<< Users send billions of 8-bit messages every year. As far as I know, all servers can handle 8-bit messages. A few years ago I was able to find a few hosts running ancient 7-bit versions of sendmail, but I don't see any now.>>
http://cr.yp.to/smtp/8bitmime.html
<< In practice, however, the body is typically encoded using all eight bits. >>
https://www.ibm.com/support/knowledgecenter/en/SSB27U_6.4.0/com.ibm.zvm.v640.kiml0/smtmlfr.htm
<< This does not cause problems in practice, since virtually all modern mail relays are 8-bit clean >>
https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol#8BITMIME
Update
To refine my question: Do SMTP servers today still clear the high bit and encode 7-bit ASCII using only the lower seven bits, or do they actually use the full octet, giving significance to the MSB?
I think what you are asking is: "Do SMTP clients shift bits when sending messages to an SMTP server such that each character only uses 7 bits and the 8th bit is the start of the next character?"
If so, no. That has never been the case.
Since the very beginning, SMTP clients/servers have always used all 8 bits per character.
In other words, SMTP clients and servers used the ASCII character encoding, which does not include the accented characters found in 8-bit encodings such as ISO-8859-1; values above 127 are simply treated as undefined.
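For instance, the following Python snippet dumps the octets of an SMTP command line: each ASCII character occupies a full byte, with the most significant bit simply left at zero (the address is made up for illustration):

    command = "MAIL FROM:<alice@example.com>\r\n"
    octets = command.encode("ascii")

    for byte in octets[:4]:
        # Each character is a full octet; for ASCII the high bit is always 0.
        print(f"{chr(byte)!r}: {byte:08b}  (MSB = {byte >> 7})")

    # 'M': 01001101  (MSB = 0)
    # 'A': 01000001  (MSB = 0)
    # 'I': 01001001  (MSB = 0)
    # 'L': 01001100  (MSB = 0)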
There are likely a number of reasons for this: