The list is kept in source code control at mozilla.org. Top-level domains are listed in alphabetical order. You can read more information on the format the list uses below. Please note that the list is encoded using UTF-8.
To be kept informed of changes to the list, you can subscribe to an Atom change feed in your favourite feed reader.
List format
A public suffix is a set of DNS names or wildcards concatenated with dots. It represents the part of a domain name which is not under the control of the individual registrant.
Specification
The list is a set of rules, with one rule per line.
Each line is only read up to the first whitespace; entire lines can also be commented using //.
Each line which is not entirely whitespace and does not begin with a comment contains a rule.
Each rule lists a public suffix, with the subdomain portions separated by dots (.) as usual. There is no leading dot.
The wildcard character * (asterisk) matches any valid sequence of characters in a hostname part. (Note: the list uses Unicode, not Punycode forms, and is encoded using UTF-8.)
Wildcards may only be used to wildcard an entire level. That is, they must be surrounded by dots (or implicit dots, at the beginning of a line).
If a hostname matches more than one rule in the file, the longest matching rule (the one with the most levels) will be used.
An exclamation mark (!) at the start of a rule marks an exception to a previous wildcard rule. An exception rule takes priority over any other matching rule.
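The reading rules above can be sketched in Python. This is a minimal illustration, not a reference parser: the function name parse_rules and the (rule, is_exception) tuple layout are assumptions made for this example, and no validation of wildcard placement is attempted.

```python
import re

def parse_rules(text):
    """Return a list of (rule, is_exception) tuples from list text.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    rules = []
    for line in text.split("\n"):
        # Each line is only read up to the first whitespace character.
        token = re.split(r"\s", line, maxsplit=1)[0]
        # Skip lines that yield nothing, and whole-line comments.
        if not token or token.startswith("//"):
            continue
        # A leading "!" marks an exception to a wildcard rule.
        is_exception = token.startswith("!")
        if is_exception:
            token = token[1:]
        rules.append((token, is_exception))
    return rules

SAMPLE = """com
*.jp
// Hosts in .hokkaido.jp can't set cookies below level 4...
*.hokkaido.jp
!pref.hokkaido.jp
"""

for rule, is_exception in parse_rules(SAMPLE):
    print(rule, "(exception)" if is_exception else "")
```

Storing the exception flag separately from the rule text keeps later matching code free of special handling for the "!" character.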
Example
Here is an example (incomplete) list section. The rules are numbered, but the numbers would not appear in the real file:
1. com
2. *.jp
// Hosts in .hokkaido.jp can't set cookies below level 4...
3. *.hokkaido.jp
4. *.tokyo.jp
// ...except hosts in pref.hokkaido.jp, which can set cookies at level 3.
5. !pref.hokkaido.jp
6. !metro.tokyo.jp
The example above would be interpreted as follows, in the case of cookie-setting, and using "foo" and "bar" as generic hostnames:
Cookies may be set for foo.com.
Cookies may be set for foo.bar.jp.
Cookies may not be set for bar.jp.
Cookies may be set for foo.bar.hokkaido.jp.
Cookies may not be set for bar.hokkaido.jp.
Cookies may be set for foo.bar.tokyo.jp.
Cookies may not be set for bar.tokyo.jp.
Cookies may be set for pref.hokkaido.jp, because the exception overrides the previous rule.
Cookies may be set for metro.tokyo.jp, because the exception overrides the previous rule.
Formal algorithm
Here is an algorithm for determining the Public Suffix of a domain. (Note: it may not be the most efficient algorithm.) The domain and all rules must be canonicalized in the normal way for hostnames - lower-case, Punycode (RFC 3492).
Definitions
The Public Suffix List consists of a series of lines, separated by \n.
Each line is only read up to the first whitespace; entire lines can also be commented using //.
Each line which is not entirely whitespace and does not begin with a comment contains a rule.
A rule may begin with a "!" (exclamation mark). If it does, it is labelled as an "exception rule" and then treated as if the exclamation mark is not present.
A domain or rule can be split into a list of labels using the separator "." (dot). The separator is not part of any of the labels.
A domain is said to match a rule if, when the domain and rule are both split, and one compares the labels from the rule to the labels from the domain, beginning at the right hand end, one finds that for every pair either they are identical, or that the label from the rule is "*" (star). The domain may legitimately have labels remaining at the end of this matching process.
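The matching definition can be sketched as a short Python function. The name matches is an assumption for this example. The definition does not say what happens when a rule has more labels than the domain; this sketch assumes such a rule does not match.

```python
def matches(domain, rule):
    """Return True if domain matches rule, comparing labels right to left.

    Hypothetical helper. The rule is expected without any leading "!".
    """
    domain_labels = domain.split(".")
    rule_labels = rule.split(".")
    # Assumption: a rule longer than the domain cannot match.
    if len(rule_labels) > len(domain_labels):
        return False
    # Pair labels starting at the right-hand end; zip stops when the
    # rule runs out, so the domain may have labels left over.
    for d, r in zip(reversed(domain_labels), reversed(rule_labels)):
        if r != "*" and r != d:
            return False
    return True

print(matches("foo.bar.jp", "*.jp"))
print(matches("jp", "*.jp"))
```

The first call compares "jp" with "jp" and "bar" with "*", leaving "foo" unused; the second fails because the domain has fewer labels than the rule.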
Algorithm
Match domain against all rules and take note of the matching ones.
If no rules match, the prevailing rule is "*".
If more than one rule matches, the prevailing rule is the one which is an exception rule.
If there is no matching exception rule, the prevailing rule is the one with the most labels.
If the prevailing rule is an exception rule, modify it by removing the leftmost label.
The public suffix is the set of labels from the domain which directly match the labels of the prevailing rule (joined by dots).
The registered or registrable domain is the public suffix plus one additional label.
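The steps above can be sketched in Python as follows. This is an illustrative sketch, not an optimized or official implementation: the names public_suffix, registrable_domain and EXAMPLE_RULES are assumptions, rules are given as (rule, is_exception) pairs with the "!" already removed, and the domain is assumed to be canonicalized already. Returning None when the domain has no label beyond its public suffix is also a choice made for this example.

```python
# Rules from the example section, as (rule, is_exception) pairs.
EXAMPLE_RULES = [
    ("com", False), ("*.jp", False),
    ("*.hokkaido.jp", False), ("*.tokyo.jp", False),
    ("pref.hokkaido.jp", True), ("metro.tokyo.jp", True),
]

def _matches(labels, rule_labels):
    # Compare labels from the right; "*" in the rule matches any label.
    if len(rule_labels) > len(labels):
        return False
    return all(r in ("*", d)
               for d, r in zip(reversed(labels), reversed(rule_labels)))

def _suffix_length(labels, rules):
    """Number of labels in the public suffix of the given domain labels."""
    matching = [(rule.split("."), exc) for rule, exc in rules
                if _matches(labels, rule.split("."))]
    if not matching:
        return 1  # no rule matches: the prevailing rule is "*"
    exceptions = [m for m in matching if m[1]]
    if exceptions:
        # An exception rule prevails; drop its leftmost label.
        return len(exceptions[0][0]) - 1
    # Otherwise the rule with the most labels prevails.
    return max(len(rule_labels) for rule_labels, _ in matching)

def public_suffix(domain, rules=EXAMPLE_RULES):
    labels = domain.split(".")
    n = _suffix_length(labels, rules)
    return ".".join(labels[len(labels) - n:])

def registrable_domain(domain, rules=EXAMPLE_RULES):
    labels = domain.split(".")
    n = _suffix_length(labels, rules)
    if len(labels) <= n:
        return None  # the domain is itself a public suffix
    return ".".join(labels[len(labels) - (n + 1):])

print(public_suffix("foo.bar.hokkaido.jp"))
print(registrable_domain("www.pref.hokkaido.jp"))
```

With the example rules, the first call selects the three-label rule *.hokkaido.jp, while the second is governed by the exception rule, so pref.hokkaido.jp is treated as registrable.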
Test Data
There is a short set of test data available. You will need to define a checkPublicSuffix() function which takes as parameters a domain name and the expected public suffix, runs your implementation on the domain name, and checks that the result is the public suffix expected.
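One possible shape for such a function, sketched in Python. Everything apart from the name checkPublicSuffix is an assumption for illustration: make_checker is a hypothetical factory that wraps whatever implementation you supply, and last_label is a deliberately naive stand-in that only behaves like the default "*" rule. The exact calling convention used by the test data file is not shown here.

```python
def make_checker(implementation):
    """Build a checkPublicSuffix function around the given implementation.

    Hypothetical helper: implementation is any callable mapping a domain
    name to the public suffix it computes.
    """
    failures = []

    def checkPublicSuffix(domain, expected):
        actual = implementation(domain)
        ok = actual == expected
        if not ok:
            # Record the mismatch so all failures can be reviewed together.
            failures.append((domain, expected, actual))
        return ok

    return checkPublicSuffix, failures

def last_label(domain):
    # Stand-in implementation: always returns the rightmost label.
    return domain.split(".")[-1]

checkPublicSuffix, failures = make_checker(last_label)
checkPublicSuffix("example.com", "com")
checkPublicSuffix("foo.bar.jp", "bar.jp")  # the stand-in gets this wrong

for domain, expected, actual in failures:
    print(f"{domain}: expected {expected!r}, got {actual!r}")
```

Collecting failures rather than raising on the first mismatch makes it easier to see every case an implementation gets wrong in a single run.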
Thanks to Rob Stradling of Comodo for providing this test data.