Email address

Example of an email address.

An email address identifies an email box to which email messages are delivered. A wide variety of formats were used in early email systems, but only a single format is used today, following the standards developed for Internet mail systems since the 1980s. This article uses the term email address to refer to the addr-spec defined in RFC 5322, not to the address that is commonly used; the difference is that an address may contain a display name, a comment, or both.

An email address such as John.Smith@example.com is made up of a local-part, an @ symbol, then a case-insensitive domain. Although the standard specifies the local-part to be case-sensitive, in practice the mail system at example.com may treat John.Smith as equivalent to JohnSmith or even johnsmith,[1] and mail systems often limit their users' choice of name to a subset of the technically valid characters. In some cases they also limit the addresses to which mail can be sent.

With the introduction of internationalized domain names, efforts are progressing to permit non-ASCII characters in email addresses.

Overview

The transmission of electronic mail within the Internet uses the Simple Mail Transfer Protocol (SMTP), defined in RFC 5321, with the message format defined in RFC 5322 and extensions such as RFC 6531. Users access and manage their mailboxes with the Post Office Protocol (POP) or the Internet Message Access Protocol (IMAP), using email client software that runs on a personal computer or mobile device, or through webmail systems that render the messages on screen or for printing.

The general format of an email address is local-part@domain, and a specific example is jsmith@example.com. An address consists of two parts. The part before the @ symbol (local-part) identifies the name of a mailbox. This is often the username of the recipient, e.g., jsmith. The part after the @ symbol (domain) is a domain name that represents the administrative realm for the mailbox, e.g., a company's domain name, example.com.
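The two-part structure described above can be illustrated with a minimal sketch. The function name split_address is hypothetical, and splitting at the last @ is an assumption made here so that a quoted local-part containing an @ is not cut in the wrong place; comments and other rarely used syntax are not handled.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an address into (local_part, domain) at the last "@"."""
    local_part, sep, domain = address.rpartition("@")
    # rpartition returns an empty separator when no "@" is present.
    if not sep or not local_part or not domain:
        raise ValueError(f"not of the form local-part@domain: {address!r}")
    return local_part, domain


print(split_address("jsmith@example.com"))  # ('jsmith', 'example.com')
```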

When delivering email, a mail server uses the Domain Name System (DNS) to look up the mail exchanger record (MX record) for the recipient's domain (the part of the email address to the right of the @). The returned MX record contains the name of the recipient's mail server. The sending mail transfer agent (MTA) then connects to this server as an SMTP client.

The local-part of an email address has no significance for intermediate mail relay systems other than the final mailbox host. Email senders and intermediate relay systems must not assume it to be case-insensitive, since the final mailbox host may or may not treat it as such. A single mailbox may receive mail for multiple email addresses, if configured by the administrator. Conversely, a single email address may be an alias for a distribution list that reaches many mailboxes. Email aliases, electronic mailing lists, sub-addressing, and catch-all addresses (mailboxes that receive messages regardless of the local-part) are common patterns for achieving a variety of delivery goals.

The addresses found in the header fields of an email message are not directly used by mail exchangers to deliver the message. An email message also contains a message envelope that contains the information for mail routing. While envelope and header addresses may be equal, forged email addresses are often seen in spam, phishing, and many other Internet-based scams. This has led to several initiatives which aim to make such forgeries easier to spot.

To indicate the message recipient, an email address may also have an associated display name for the recipient, which is followed by the address specification surrounded by angle brackets, for example: John Smith <[email protected]>.
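As a brief illustration of the display-name form, the sketch below uses parseaddr and formataddr from the email.utils module in Python's standard library to separate a display name from the address specification and to join them again. The address john.smith@example.com is an illustrative assumption.

```python
from email.utils import formataddr, parseaddr

# Separate the display name from the address inside the angle brackets.
display_name, addr_spec = parseaddr("John Smith <john.smith@example.com>")
print(display_name)  # John Smith
print(addr_spec)     # john.smith@example.com

# Join a display name and an address back into a single string.
rebuilt = formataddr(("John Smith", "john.smith@example.com"))
print(rebuilt)       # John Smith <john.smith@example.com>
```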

Earlier forms of email addresses on networks other than the Internet included other notations, such as that required by X.400, and the UUCP bang path notation, in which the address was given as a sequence of computers through which the message should be relayed. Bang path notation was widely used for several years, but was superseded by the Internet standards promulgated by the Internet Engineering Task Force (IETF).

Syntax

The format of email addresses is local-part@domain, where the local-part may be up to 64 characters long and the domain may have a maximum of 255 characters.[2] However, the 256-character maximum length of a forward or reverse path restricts the entire email address to no more than 254 characters.[3] The formal definitions are in RFC 5322 (sections 3.2.3 and 3.4.1) and RFC 5321, with a more readable form given in the informational RFC 3696[4] and the associated errata.
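The three limits above can be checked with a short sketch. The function name and constants are hypothetical. Lengths are counted in UTF-8 octets, which is an assumption that makes no difference for ASCII addresses, where one character is one octet.

```python
MAX_LOCAL_PART = 64   # limit on the local-part
MAX_DOMAIN = 255      # limit on the domain
MAX_ADDRESS = 254     # limit on the whole address implied by the path limit


def within_length_limits(address: str) -> bool:
    """Return True if the address respects the three length limits."""
    local_part, sep, domain = address.rpartition("@")
    if not sep:
        return False  # no "@" at all
    return (
        1 <= len(local_part.encode("utf-8")) <= MAX_LOCAL_PART
        and 1 <= len(domain.encode("utf-8")) <= MAX_DOMAIN
        and len(address.encode("utf-8")) <= MAX_ADDRESS
    )
```

Such a check covers length only; it says nothing about which characters are permitted.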

Local-part

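The local-part of the email address may use any of these ASCII characters:

uppercase and lowercase Latin letters A to Z and a to z
digits 0 to 9
the special characters !#$%&'*+-/=?^_`{|}~
dot (.), provided that it is not the first or last character and does not appear consecutively, unless quoted
space and the characters "(),:;<>@[\], which are allowed only inside a quoted string, where a backslash or double quote must also be preceded by a backslash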

In addition to the above ASCII characters, international characters above U+007F, encoded as UTF-8, are permitted by RFC 6531, though mail systems may restrict which characters to use when assigning local-parts.

A quoted string may exist as a dot-separated entity within the local-part, or it may exist when the outermost quotes are the outermost characters of the local-part (e.g., abc."defghi".xyz@example.com or "abcdefghixyz"@example.com are allowed; conversely, abc"defghi"xyz@example.com is not, and neither is abc\"def\"ghi@example.com). Quoted strings and characters, however, are not commonly used. RFC 5321 also warns that "a host that expects to receive mail SHOULD avoid defining mailboxes where the Local-part requires (or uses) the Quoted-string form".
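Because quoted strings are rarely used, a checker often handles only the unquoted form. The following is a minimal sketch under that assumption: it accepts dot-separated runs of the unquoted characters listed in RFC 5322 (the "atext" set) and rejects everything else, including valid quoted local-parts. The names ATEXT, DOT_ATOM and is_unquoted_local_part are hypothetical.

```python
import re

# One character that RFC 5322 allows outside quotation marks ("atext").
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# One or more runs of atext separated by single dots:
# no leading dot, no trailing dot, no consecutive dots.
DOT_ATOM = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_unquoted_local_part(local_part: str) -> bool:
    """Return True for an unquoted local-part of at most 64 characters."""
    return len(local_part) <= 64 and DOT_ATOM.fullmatch(local_part) is not None
```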

The local-part postmaster is treated specially: it is case-insensitive, and mail addressed to it should be forwarded to the domain's email administrator. Technically, all other local-parts are case-sensitive, so [email protected] and [email protected] specify different mailboxes; however, many organizations treat uppercase and lowercase letters as equivalent.
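The comparison rules in this paragraph can be sketched as follows. The function name is hypothetical, and the sketch deliberately follows the strict reading of the standard: a sender cannot know whether the receiving host folds case, so only the domain and the postmaster local-part are compared case-insensitively.

```python
def same_mailbox_per_standard(a: str, b: str) -> bool:
    """Compare two ASCII addresses without assuming local-part case folding."""
    local_a, _, domain_a = a.rpartition("@")
    local_b, _, domain_b = b.rpartition("@")
    # The domain is case-insensitive.
    if domain_a.lower() != domain_b.lower():
        return False
    # "postmaster" is the one local-part that is case-insensitive.
    if local_a.lower() == "postmaster" and local_b.lower() == "postmaster":
        return True
    # Every other local-part must match exactly.
    return local_a == local_b
```

A mail host that treats uppercase and lowercase letters as equivalent would apply its own, looser comparison to the addresses it serves.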

Despite the wide range of special characters that are technically valid, organisations, mail services, mail servers and mail clients in practice often do not accept all of them. For example, Windows Live Hotmail only allows creation of email addresses using alphanumerics, dot (.), underscore (_) and hyphen (-).[5] Common advice is to avoid some special characters in order to reduce the risk of rejected emails.[6]

Domain

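The domain name part of an email address has to conform to strict guidelines: it must match the requirements for a hostname, a list of dot-separated DNS labels, each label being limited to a length of 63 characters and consisting of:[7]

uppercase and lowercase Latin letters A to Z and a to z
digits 0 to 9
hyphen (-), provided that it is not the first or last character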

This rule is known as the LDH rule (letters, digits, hyphen). In addition, the domain may be an IP address literal, surrounded by square brackets [], such as jsmith@[192.168.2.1] or jsmith@[IPv6:2001:db8::1], although this is rarely seen except in email spam. Internationalized domain names (which are encoded to comply with the requirements for a hostname) allow for presentation of non-ASCII domains. In mail systems compliant with RFC 6531 and RFC 6532, both the local-part and the domain name of an email address may be encoded as UTF-8.
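The LDH rule and the bracketed address literals can be checked with a short sketch. The names LDH_LABEL and is_valid_domain are hypothetical. The sketch assumes an ASCII domain (an internationalized name would first be converted to its encoded form), assumes that a hyphen may not begin or end a label, and ignores comments.

```python
import ipaddress
import re

# A label of 1 to 63 letters, digits or hyphens that neither starts
# nor ends with a hyphen.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_domain(domain: str) -> bool:
    """Return True for an LDH hostname or a bracketed IP address literal."""
    if domain.startswith("[") and domain.endswith("]"):
        literal = domain[1:-1]
        try:
            if literal.lower().startswith("ipv6:"):
                ipaddress.IPv6Address(literal[5:])
            else:
                ipaddress.IPv4Address(literal)
        except ValueError:
            return False
        return True
    if len(domain) > 255:
        return False
    # An empty label (leading, trailing or doubled dot) fails the match.
    return all(LDH_LABEL.fullmatch(label) for label in domain.split("."))
```

For example, is_valid_domain("example.com") and is_valid_domain("[IPv6:2001:db8::1]") are both accepted, while is_valid_domain("example..com") is rejected.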

Comments are allowed in the domain as well as in the local-part; for example, john.smith@(comment)example.com and john.smith@example.com(comment) are equivalent to john.smith@example.com.

Examples

Valid email addresses
[email protected]
[email protected]
[email protected]
[email protected]
[email protected] (one-letter local-part)
"much.more unusual"@example.com
"[email protected]"@example.com
"very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com
[email protected]
admin@mailserver1 (local domain name with no TLD)
#!$%&'*+-/=?^_`{}|[email protected]
"()<>[]:,;@\\\"!#$%&'-/=?^_`{}| ~.a"@example.org
" "@example.org (space between the quotes)
example@localhost (sent from localhost)
[email protected] (see the List of Internet top-level domains)
user@localserver
user@tt (although ICANN highly discourages dotless email addresses)
user@[IPv6:2001:DB8::1]
Invalid email addresses
Abc.example.com (no @ character)
A@b@[email protected] (only one @ is allowed outside quotation marks)
a"b(c)d,e:f;g<h>i[j\k][email protected] (none of the special characters in this local-part are allowed outside quotation marks)
just"not"[email protected] (quoted strings must be dot separated or the only element making up the local-part)
this is"not\[email protected] (spaces, quotes, and backslashes may only exist when within quoted strings and preceded by a backslash)
this\ still\"not\\[email protected] (even if escaped (preceded by a backslash), spaces, quotes, and backslashes must still be contained by quotes)
1234567890123456789012345678901234567890123456789012345678901234+x@example.com (local-part is longer than 64 characters)
[email protected] (double dot before @)
with caveat: Gmail lets this through, since it ignores the dots in the local-part altogether
[email protected] (double dot after @)
a valid address with a leading space
a valid address with a trailing space

Common local-part semantics

According to RFC 5321 2.3.11 Mailbox and Address, "...the local-part MUST be interpreted and assigned semantics only by the host specified in the domain of the address." This means that no assumptions can be made about the meaning of the local-part of another mail server. It is entirely up to the configuration of the mail server.

Local-part normalization

Interpretation of the local part of an email address is dependent on the conventions and policies implemented in the mail server. For example, case sensitivity may distinguish mailboxes differing only in capitalization of characters of the local-part, although this is not very common.[8] Gmail ignores all dots in the local-part for the purposes of determining account identity.[9] This prevents the creation of user accounts your.user.name or yourusername when the account your.username already exists.
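The dot-insensitive behaviour described for Gmail can be sketched as a normalization step that a site might apply when comparing account identities. The function name and the DOT_INSENSITIVE_DOMAINS set are hypothetical, and the sketch implements only the dot rule stated above; it does not reflect how any provider works internally.

```python
# Hypothetical set of domains whose host is known to ignore dots.
DOT_INSENSITIVE_DOMAINS = {"gmail.com"}


def account_identity(address: str) -> str:
    """Remove dots from the local-part for hosts known to ignore them."""
    local_part, _, domain = address.rpartition("@")
    domain = domain.lower()  # the domain is case-insensitive
    if domain in DOT_INSENSITIVE_DOMAINS:
        local_part = local_part.replace(".", "")
    return f"{local_part}@{domain}"


print(account_identity("your.user.name@gmail.com"))  # yourusername@gmail.com
print(account_identity("your.username@gmail.com"))   # yourusername@gmail.com
```

Addresses at any other domain are returned with the local-part untouched, because only the receiving host may assign meaning to it.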

Sub-addressing

Some mail services support a tag appended to the local-part, such that the modified address is an alias to the unmodified one. For example, the address [email protected] denotes the same delivery address as [email protected]. RFC 5233 refers to this convention as sub-addressing, but it is also known as plus addressing or tagged addressing.

Addresses of this form, using various separators between the base name and the tag, are supported by several email services, including Runbox (plus), Gmail (plus),[10] Yahoo! Mail Plus (hyphen),[11] Apple's iCloud (plus), Outlook.com (plus),[12] FastMail (plus and Subdomain Addressing),[13] MMDF (equals), Qmail and Courier Mail Server (hyphen).[14][15] Postfix allows configuring an arbitrary separator from the legal character set.[16]
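How a receiving mail host might separate the base name from the tag can be sketched as follows. The function name and the example local-parts are hypothetical. The separator is a parameter because, as noted above, it differs between services. A sender or website should not perform this split on someone else's address, since only the receiving host knows whether sub-addressing applies.

```python
from typing import Optional


def split_subaddress(local_part: str, separator: str = "+") -> tuple[str, Optional[str]]:
    """Split a local-part into (base, tag) at the first separator."""
    base, sep, tag = local_part.partition(separator)
    # No separator, or nothing before it: treat the whole string as the base.
    if not sep or not base:
        return local_part, None
    return base, tag


print(split_subaddress("jsmith+newsletters"))  # ('jsmith', 'newsletters')
print(split_subaddress("jsmith-lists", "-"))   # ('jsmith', 'lists')
print(split_subaddress("jsmith"))              # ('jsmith', None)
```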

The text of the tag may be used to apply filtering,[14] or to create single-use or disposable email addresses.[17]

In practice, the form validation of some websites may reject special characters such as "+" in an email address, incorrectly treating them as invalid. If a website silently strips the "+" without any warning or error message, the wrong user may receive an email. For example, an email intended for the user-entered email address [email protected] could be incorrectly sent to [email protected]. In other cases a poor user experience can occur if some parts of a site, such as a user registration page, allow the "+" character whilst other parts, such as a page for unsubscribing from a site's mailing list, do not.

Validation and verification

Websites often request an email address as user input, both to identify the user and as a field subject to data validation. While some companies provide services that validate an email address at the time of entry, normally through an application programming interface (API), there is no guarantee that such a service will provide accurate results.[18]

An email address is generally recognized as having two parts joined with an at-sign (@). However, the technical specification detailed in RFC 822 and subsequent RFCs is more extensive.[19] A regular expression can be used to check for all of these criteria, except that of bracketed nested comments.[20]

A syntactically correct email address does not guarantee that the mailbox exists. Many mail servers therefore use other techniques, such as checking the domain against the Domain Name System or using callback verification to check whether the mailbox exists. Such checks are, however, often disabled to avoid directory harvest attacks.

Ensuring that an email address is of good quality requires a combination of validation techniques. Large websites, bulk mailers and spammers require fast algorithms that predict the validity of an email address. Such methods depend heavily on heuristic algorithms and statistical models.[21]

Many websites evaluate the validity of email addresses differently than the standards specify, rejecting addresses containing valid characters, such as + and /, or enforcing arbitrary length limitations. RFC 3696 provides specific advice for validating Internet identifiers, including email addresses.

HTML5 forms, implemented in many browsers, allow email address validation to be handled by the browser.[22]

Email address internationalization provides for a much larger range of characters than many current validation algorithms allow, such as all Unicode characters above U+007F, encoded as UTF-8.

Identity validation

Email addresses are the primary means of account activation (user identification and validation on websites), but other methods are available, such as cell phone number validation, postal mail validation, and fax validation. Email address validation is accomplished by the website sending an email with a special temporary hyperlink to the user-provided email address. On receipt, the user opens the link, immediately activating the account. Email addresses are also useful as a means of forwarding messages from a website, e.g., user messages or user actions, to the email inbox.
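The temporary-hyperlink flow can be sketched in a few lines. Everything here is a hypothetical illustration: the function names, the in-memory store, the 24-hour lifetime and the www.example.com URL are assumptions, and sending the email itself is left out. A real site would typically keep pending activations in a database.

```python
import secrets
import time
from typing import Optional

TOKEN_TTL_SECONDS = 24 * 60 * 60  # assumed lifetime of the temporary link

# token -> (address, expiry timestamp); an in-memory stand-in for a database.
_pending: dict[str, tuple[str, float]] = {}


def issue_activation_link(address: str, now: Optional[float] = None) -> str:
    """Create an unguessable token and return the link to email to the user."""
    token = secrets.token_urlsafe(32)
    issued_at = time.time() if now is None else now
    _pending[token] = (address, issued_at + TOKEN_TTL_SECONDS)
    return f"https://www.example.com/activate?token={token}"


def activate(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the confirmed address, or None if the token is unknown or expired."""
    entry = _pending.pop(token, None)  # pop makes each token single-use
    if entry is None:
        return None
    address, expires_at = entry
    current = time.time() if now is None else now
    return address if current <= expires_at else None
```

Opening the link proves only that someone with access to the mailbox received the message, which is the property the activation step relies on.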

Internationalization

The IETF has a working group devoted to the internationalization of email addresses, named Email Address Internationalization (EAI, also known as IMA, Internationalized Mail Address).[23] This group produced RFC 6530, RFC 6531, RFC 6532, and RFC 6533, and continues to work on additional EAI-related RFCs.

The IETF's EAI Working Group published RFC 6530, "Overview and Framework for Internationalized Email", which enabled non-ASCII characters to be used in both the local-part and the domain of an email address. RFC 6530 provides for email based on the UTF-8 encoding, which permits the full repertoire of Unicode. RFC 6531 provides a mechanism for SMTP servers to negotiate transmission of SMTPUTF8 content.

The basic EAI concepts involve exchanging mail in UTF-8. Though the original proposal included a downgrading mechanism for legacy systems, this has now been dropped.[24] The local servers are responsible for the local-part of the address, whereas the domain would be restricted by the rules of internationalized domain names, though still transmitted in UTF-8. The mail server is also responsible for any mapping mechanism between the IMA form and any ASCII alias.

EAI enables users to have a localized address in a native language script or character set, as well as an ASCII form for communicating with legacy systems or for script-independent use. Applications that recognize internationalized domain names and mail addresses must have facilities to convert these representations.
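The conversion between a native-script domain and its ASCII form can be sketched with the idna codec built into Python. The function names and the example domain bücher.example are illustrative assumptions. The built-in codec implements the older IDNA 2003 rules, so a production system may need a library that follows the current rules; the local-part is not converted, since its interpretation belongs to the receiving host.

```python
def to_ascii_domain(domain: str) -> str:
    """Encode an internationalized domain name into its ASCII form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain: str) -> str:
    """Decode an ASCII-encoded domain name back to its native-script form."""
    return domain.encode("ascii").decode("idna")


print(to_ascii_domain("bücher.example"))           # xn--bcher-kva.example
print(to_unicode_domain("xn--bcher-kva.example"))  # bücher.example
```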

Significant demand for such addresses is expected in China, Japan, Russia, and other markets that have large user bases in a non-Latin-based writing system. For example, in addition to the .in top-level domain, the government of India obtained approval in 2011[25] for ".bharat" (from Bhārat Gaṇarājya), written in seven different scripts[26][27] for use by Gujarati, Marathi, Bengali, Tamil, Telugu, Punjabi and Urdu speakers.

Internationalization examples

The example addresses below would not be handled by RFC 5322 based servers, but are permitted by RFC 6530. Servers compliant with this will be able to handle these:

Internationalization support

Standards documents

See also

References

  1. "...you can add or remove the dots from a Gmail address without changing the actual destination address; and they'll all go to your inbox...", Google.com
  2. RFC 5321, section 4.5.3.1. Size Limits and Minimums explicitly details protocol limits.
  3. RFC 3696 Errata, Errata ID 1690.
  4. Written by J. Klensin, the author of RFC 5321
  5. "Sign up for Windows Live". Retrieved 2008-07-26. However, the phrase is hidden, thus one has to either check the availability of an invalid ID, e.g. me#1, or resort to alternative displaying, e.g. no-style or source view, in order to read it.
  6. "Characters in the local part of an email address". Retrieved 2016-03-30.
  7. RFC 3696, section 2. Restrictions on domain (DNS) names
  8. Are Email Addresses Case Sensitive? by Heinz Tschabitscher
  9. "Receiving someone else's mail". google.com.
  10. "Using an address alias". google.com.
  11. https://help.yahoo.com/kb/SLN3523.html
  12. "Outlook.com supports simpler "+" email aliases too". Within Windows.
  13. "Plus addressing and subdomain addressing". fastmail.fm.
  14. "Dot-Qmail, Control the delivery of mail messages". Retrieved 27 January 2012.
  15. Sill, Dave. "4.1.5. extension addresses". Life with qmail. Retrieved 27 January 2012.
  16. "Postfix Configuration Parameters". postfix.org.
  17. Gina Trapani (2005) "Instant disposable Gmail addresses"
  18. When a Valid and Deliverable Email is Neither Valid nor Deliverable Paul, Andrew. Email Answers. Retrieved 26 April 2013
  19. I Knew How To Validate An Email Address Until I Read The RFC
  20. Mail::RFC822::Address
  21. Verification & Validation Techniques for Email Address Quality Assurance by Jan Hornych 2011, University of Oxford
  22. "4.10 Forms — HTML5". w3.org.
  23. "Eai Status Pages". Email Address Internationalization (Active WG). IETF. March 17, 2006 – March 18, 2013. Retrieved July 26, 2008.
  24. "Email Address Internationalization (eai)". IETF. Retrieved November 30, 2010.
  25. "2011-01-25 - Approval of Delegation of the seven top-level domains representing India in various languages"
  26. "Internationalized Domain Names (IDNs) | Registry.In". registry.in. Retrieved 2016-10-17.
  27. "Now, get your email address in Hindi - The Economic Times". The Economic Times. Retrieved 2016-10-17.
  28. "'Postfix stable release 3.0.0' – MARC". marc.info.
  29. "A first step toward more global email". Google Official Blog. Google. Retrieved 6 August 2014.
  30. "What's new in Outlook 2016 for Windows", support.office.com
  31. "IDN EMAIL WEB HOSTING | XgenPlus". www.xgenplus.com. Retrieved 2016-10-17.
  32. "IDN - ICANNWiki". icannwiki.com. Retrieved 2016-10-17.
This article is issued from Wikipedia - version of the 12/3/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.