What are the 153 top level domains starting with XN?


I was inspecting the full list of IANA top-level domains and came across some unusual ones, in particular 153 top-level domains starting with XN:

XN--11B4C3D
XN--1CK2E1B
XN--1QQW23A
XN--2SCRJ9C
XN--30RR7Y
XN--3BST00M

What are the domains starting with XN?

Note

Here's some R code to extract the full list of XN domains for exploration:

library(tidyverse)
library(rvest)

domains <- read_html("http://data.iana.org/TLD/tlds-alpha-by-domain.txt") %>% 
  html_nodes("body") %>% 
  html_text %>% 
  str_split("\n") %>% 
  unlist %>% 
  as.data.frame %>% 
  `colnames<-`("tld")


# Starts with XN

domains %>% 
  filter(substr(tld, 1, 2) == "XN")

#                          tld
# 1                XN--11B4C3D
# 2                XN--1CK2E1B
# 3                XN--1QQW23A
# 4                XN--2SCRJ9C
# 5                 XN--30RR7Y
# 6                XN--3BST00M
# 7                XN--3DS443G
# 8               XN--3E0B707E
# 9                XN--3HCRJ9C
# 10         XN--3OQ18VL8PN36A
# 11                XN--3PXU8K
# 12               XN--42C2D9A
# 13              XN--45BR5CYL
# ---                      ---
# 146               XN--WGBL6A
# 147              XN--XHQ521B
# 148         XN--XKC2AL3HYE2A
# 149        XN--XKC2DL3A5EE0H
# 150               XN--Y9A3AQ
# 151            XN--YFRO4I67O
# 152            XN--YGBI2AMMX
# 153              XN--ZFR164B
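
For anyone who prefers to avoid the HTML parser, here is a minimal base-R sketch of the same filter. It assumes the IANA file is plain text with one TLD per line and a leading `#` comment line; `extract_idn_tlds` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: keep only the "XN--" entries from the lines of the TLD file.
extract_idn_tlds <- function(lines) {
  lines <- trimws(lines)
  # Drop blank lines and the "# Version ..." comment header
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  # Match case-insensitively, but return the entries as they appear in the file
  lines[startsWith(toupper(lines), "XN--")]
}

# Small offline sample standing in for the real file contents
sample_lines <- c("# Version (sample), Last Updated (sample)", "COM", "XN--P1AI", "XN--NODE", "XYZ", "")
extract_idn_tlds(sample_lines)

# Against the live list (requires network access):
# extract_idn_tlds(readLines("http://data.iana.org/TLD/tlds-alpha-by-domain.txt"))
```

Matching on the full `XN--` prefix rather than just the first two characters avoids picking up any ordinary TLD that merely begins with "XN".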

Full set

XN--11B4C3D XN--1CK2E1B XN--1QQW23A XN--2SCRJ9C XN--30RR7Y XN--3BST00M XN--3DS443G XN--3E0B707E XN--3HCRJ9C XN--3OQ18VL8PN36A XN--3PXU8K XN--42C2D9A XN--45BR5CYL XN--45BRJ9C XN--45Q11C XN--4GBRIM XN--54B7FTA0CC XN--55QW42G XN--55QX5D XN--5SU34J936BGSG XN--5TZM5G XN--6FRZ82G XN--6QQ986B3XL XN--80ADXHKS XN--80AO21A XN--80AQECDR1A XN--80ASEHDB XN--80ASWG XN--8Y0A063A XN--90A3AC XN--90AE XN--90AIS XN--9DBQ2A XN--9ET52U XN--9KRT00A XN--B4W605FERD XN--BCK1B9A5DRE4C XN--C1AVG XN--C2BR7G XN--CCK2B3B XN--CCKWCXETD XN--CG4BKI XN--CLCHC0EA0B2G2A9GCD XN--CZR694B XN--CZRS0T XN--CZRU2D XN--D1ACJ3B XN--D1ALF XN--E1A4C XN--ECKVDTC9D XN--EFVY88H XN--FCT429K XN--FHBEI XN--FIQ228C5HS XN--FIQ64B XN--FIQS8S XN--FIQZ9S XN--FJQ720A XN--FLW351E XN--FPCRJ9C3D XN--FZC2C9E2C XN--FZYS8D69UVGM XN--G2XX48C XN--GCKR3F0F XN--GECRJ9C XN--GK3AT1E XN--H2BREG3EVE XN--H2BRJ9C XN--H2BRJ9C8C XN--HXT814E XN--I1B6B1A6A2E XN--IMR513N XN--IO0A7I XN--J1AEF XN--J1AMH XN--J6W193G XN--JLQ480N2RG XN--JLQ61U9W7B XN--JVR189M XN--KCRX77D1X4A XN--KPRW13D XN--KPRY57D XN--KPUT3I XN--L1ACC XN--LGBBAT1AD8J XN--MGB9AWBF XN--MGBA3A3EJT XN--MGBA3A4F16A XN--MGBA7C0BBN0A XN--MGBAAKC7DVF XN--MGBAAM7A8H XN--MGBAB2BD XN--MGBAH1A3HJKRD XN--MGBAI9AZGQP6J XN--MGBAYH7GPA XN--MGBBH1A XN--MGBBH1A71E XN--MGBC0A9AZCG XN--MGBCA7DZDO XN--MGBCPQ6GPA1A XN--MGBERP4A5D4AR XN--MGBGU82A XN--MGBI4ECEXP XN--MGBPL2FH XN--MGBT3DHD XN--MGBTX2B XN--MGBX4CD0AB XN--MIX891F XN--MK1BU44C XN--MXTQ1M XN--NGBC5AZD XN--NGBE9E0A XN--NGBRX XN--NODE XN--NQV7F XN--NQV7FS00EMA XN--NYQY26A XN--O3CW4H XN--OGBPF8FL XN--OTU796D XN--P1ACF XN--P1AI XN--PGBS0DH XN--PSSY2U XN--Q7CE6A XN--Q9JYB4C XN--QCKA1PMC XN--QXA6A XN--QXAM XN--RHQV96G XN--ROVU88B XN--RVC1E0AM3E XN--S9BRJ9C XN--SES554G XN--T60B56A XN--TCKWE XN--TIQ49XQYJ XN--UNUP4Y XN--VERMGENSBERATER-CTB XN--VERMGENSBERATUNG-PWB XN--VHQUV XN--VUQ861B XN--W4R85EL8FHU5DNRA XN--W4RS40L XN--WGBH1C XN--WGBL6A XN--XHQ521B XN--XKC2AL3HYE2A XN--XKC2DL3A5EE0H XN--Y9A3AQ XN--YFRO4I67O XN--YGBI2AMMX XN--ZFR164B

There are 2 answers

velvetkevorkian On BEST ANSWER

They're Punycode (ASCII-compatible) encodings of non-ASCII domain names. For example, the IANA WHOIS record for XN--3HCRJ9C shows:

% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

domain:       ଭାରତ
domain-ace:   XN--3HCRJ9C
stevec On

The domains are punycode:

Punycode is a simple and efficient transfer encoding syntax designed for use with Internationalized Domain Names in Applications. It uniquely and reversibly transforms a Unicode string into an ASCII string. ASCII characters in the Unicode string are represented literally, and non-ASCII characters are represented by ASCII characters that are allowed in host name labels (letters, digits, and hyphens).
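
To make the "reversible" part concrete, here is a rough base-R sketch of the decoding direction, following the algorithm and constants in RFC 3492. The function names `punycode_adapt` and `punycode_decode_label` are made up for illustration, and the sketch does no input validation or overflow checking, so treat it as a teaching aid rather than a replacement for a real library.

```r
# Bias adaptation from RFC 3492 (base = 36, tmin = 1, tmax = 26, skew = 38, damp = 700)
punycode_adapt <- function(delta, numpoints, first_time) {
  delta <- if (first_time) delta %/% 700 else delta %/% 2
  delta <- delta + delta %/% numpoints
  k <- 0
  while (delta > 455) {            # ((base - tmin) * tmax) %/% 2
    delta <- delta %/% 35          # base - tmin
    k <- k + 36
  }
  k + (36 * delta) %/% (delta + 38)
}

# Hypothetical helper: decode one "xn--" label to Unicode (no validation of malformed input)
punycode_decode_label <- function(label) {
  label <- tolower(label)
  if (!startsWith(label, "xn--")) return(label)
  chars <- strsplit(substring(label, 5), "", fixed = TRUE)[[1]]

  # Literal ASCII characters, if any, come before the last hyphen
  hyphens <- which(chars == "-")
  output <- numeric(0)
  if (length(hyphens) > 0) {
    last <- max(hyphens)
    output <- utf8ToInt(paste(chars[seq_len(last - 1)], collapse = ""))
    chars <- chars[-seq_len(last)]
  }

  # Digit values: a-z are 0..25, 0-9 are 26..35
  digits <- match(chars, c(letters, 0:9)) - 1
  n <- 128; i <- 0; bias <- 72; pos <- 1
  while (pos <= length(digits)) {
    old_i <- i; w <- 1; k <- 36
    repeat {                       # read one variable-length integer
      digit <- digits[pos]; pos <- pos + 1
      i <- i + digit * w
      t <- min(max(k - bias, 1), 26)
      if (digit < t) break
      w <- w * (36 - t)
      k <- k + 36
    }
    len <- length(output) + 1
    bias <- punycode_adapt(i - old_i, len, old_i == 0)
    n <- n + i %/% len             # code point to insert
    i <- i %% len                  # position to insert it at
    output <- append(output, n, after = i)
    i <- i + 1
  }
  intToUtf8(output)
}

punycode_decode_label("XN--P1AI")                 # "рф"
punycode_decode_label("XN--NODE")                 # "გე"
punycode_decode_label("XN--VERMGENSBERATER-CTB")  # "vermögensberater"
```

Note that this sketch lowercases the label before decoding, which is why the last example comes out in lower case.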

Here are all 153 punycode domains after decoding:

  [1] "कॉम"               "セール"            "佛山"             
  [4] "ಭಾರತ"              "慈善"              "集团"             
  [7] "在线"              "한국"              "ଭାରତ"             
 [10] "点看"              "คอม"               "ভাৰত"             
 [13] "ভারত"              "八卦"              "ישראל"            
 [16] "موقع"              "বাংলা"             "公益"             
 [19] "公司"              "香格里拉"          "网站"             
 [22] "移动"              "我爱你"            "москва"           
 [25] "қаз"               "католик"           "онлайн"           
 [28] "сайт"              "联通"              "срб"              
 [31] "бг"                "бел"               "קום"              
 [34] "时尚"              "微博"              "淡马锡"           
 [37] "ファッション"      "орг"               "नेट"               
 [40] "ストア"            "アマゾン"          "삼성"             
 [43] "சிங்கப்பூர்"          "商标"              "商店"             
 [46] "商城"              "дети"              "мкд"              
 [49] "ею"                "ポイント"          "新闻"             
 [52] "家電"              "كوم"               "中文网"           
 [55] "中信"              "中国"              "中國"             
 [58] "娱乐"              "谷歌"              "భారత్"              
 [61] "ලංකා"              "電訊盈科"          "购物"             
 [64] "クラウド"          "ભારત"              "通販"             
 [67] "भारतम्"             "भारत"              "भारोत"            
 [70] "网店"              "संगठन"              "餐厅"             
 [73] "网络"              "ком"               "укр"              
 [76] "香港"              "亚马逊"            "诺基亚"           
 [79] "食品"              "飞利浦"            "台湾"             
 [82] "台灣"              "手机"              "мон"              
 [85] "الجزائر"           "عمان"              "ارامكو"           
 [88] "ایران"             "العليان"           "اتصالات"          
 [91] "امارات"            "بازار"             "موريتانيا"        
 [94] "پاکستان"           "الاردن"            "بارت"             
 [97] "بھارت"             "المغرب"            "ابوظبي"           
[100] "البحرين"           "السعودية"          "ڀارت"             
[103] "كاثوليك"           "سودان"             "همراه"            
[106] "عراق"              "مليسيا"            "澳門"             
[109] "닷컴"              "政府"              "شبكة"             
[112] "بيتك"              "عرب"               "გე"               
[115] "机构"              "组织机构"          "健康"             
[118] "ไทย"               "سورية"             "招聘"             
[121] "рус"               "рф"                "تونس"             
[124] "大拿"              "ລາວ"               "みんな"           
[127] "グーグル"          "ευ"                "ελ"               
[130] "世界"              "書籍"              "ഭാരതം"            
[133] "ਭਾਰਤ"              "网址"              "닷넷"             
[136] "コム"              "天主教"            "游戏"             
[139] "VERMöGENSBERATER"  "VERMöGENSBERATUNG" "企业"             
[142] "信息"              "嘉里大酒店"        "嘉里"             
[145] "مصر"               "قطر"               "广东"             
[148] "இலங்கை"             "இந்தியா"            "հայ"              
[151] "新加坡"            "فلسطين"            "政务"      

And here is the R code used to decode the domains (you can also use an online punycode converter):

library(tidyverse)
library(rvest)

domains <- read_html("http://data.iana.org/TLD/tlds-alpha-by-domain.txt") %>% 
  html_nodes("body") %>% 
  html_text %>% 
  str_split("\n") %>% 
  unlist %>% 
  as.data.frame %>% 
  `colnames<-`("tld")


punycode_domains <- domains %>% 
  filter(substr(tld, 1, 2) == "XN") %>% 
  pull(tld)


# devtools::install_github("hrbrmstr/punycode") # Run once to install
library(punycode)

# Decode the ASCII "XN--" labels back to their Unicode form
puny_decode(punycode_domains)