String.prototype.normalize()
Baseline
Widely available
This feature is well established and works across many devices and browser versions. It's been available across browsers since September 2016.
The normalize() method of String values returns the Unicode Normalization Form of this string.
Try it
const name1 = "\u0041\u006d\u00e9\u006c\u0069\u0065";
const name2 = "\u0041\u006d\u0065\u0301\u006c\u0069\u0065";
console.log(`${name1}, ${name2}`);
// Expected output: "Amélie, Amélie"
console.log(name1 === name2);
// Expected output: false
console.log(name1.length === name2.length);
// Expected output: false
const name1NFC = name1.normalize("NFC");
const name2NFC = name2.normalize("NFC");
console.log(`${name1NFC}, ${name2NFC}`);
// Expected output: "Amélie, Amélie"
console.log(name1NFC === name2NFC);
// Expected output: true
console.log(name1NFC.length === name2NFC.length);
// Expected output: true
Syntax
normalize()
normalize(form)
Parameters
form (optional)-
One of "NFC", "NFD", "NFKC", or "NFKD", specifying the Unicode Normalization Form. If omitted or undefined, "NFC" is used. These values have the following meanings:
"NFC"-
Normalization Form C: canonical decomposition, followed by canonical composition.
"NFD"-
Normalization Form D: canonical decomposition.
"NFKC"-
Normalization Form KC: compatibility decomposition, followed by canonical composition.
"NFKD"-
Normalization Form KD: compatibility decomposition.
Return value
A string containing the Unicode Normalization Form of the given string.
Exceptions
RangeError-
Thrown if form is not one of the values specified above.
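The default form and the error case can be checked with a short sketch like the one below. The helper name tryNormalize and its convention of returning null for an invalid form are illustrative assumptions, not part of the API.

```js
// Omitting the argument (or passing undefined) behaves like passing "NFC".
const decomposed = "\u006E\u0303"; // "n" followed by a combining tilde
console.log(decomposed.normalize() === decomposed.normalize("NFC")); // true
console.log(decomposed.normalize(undefined) === "\u00F1"); // true

// Hypothetical helper: returns null instead of throwing when the form is invalid.
function tryNormalize(str, form) {
  try {
    return str.normalize(form);
  } catch (err) {
    if (err instanceof RangeError) {
      return null; // assumed convention for "invalid form"
    }
    throw err; // anything else is unexpected, so rethrow it
  }
}

// Form names are case-sensitive, so "nfc" is rejected with a RangeError.
console.log(tryNormalize(decomposed, "nfc")); // null
console.log(tryNormalize(decomposed, "NFC") === "\u00F1"); // true
```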
Description
Unicode assigns a unique numerical value, called a code point, to each character. For example, the code point for "A" is U+0041. However, sometimes more than one code point, or sequence of code points, can represent the same abstract character. For example, the character "ñ" can be represented by either of:
- The single code point U+00F1.
- The code point for "n" (U+006E) followed by the code point for the combining tilde (U+0303).
const string1 = "\u00F1";
const string2 = "\u006E\u0303";
console.log(string1); // ñ
console.log(string2); // ñ
However, since the code points are different, string comparison does not treat them as equal. And since the number of code points in each version is different, they even have different lengths.
const string1 = "\u00F1"; // ñ
const string2 = "\u006E\u0303"; // ñ
console.log(string1 === string2); // false
console.log(string1.length); // 1
console.log(string2.length); // 2
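The normalize() method helps solve this problem by converting a string into a normalized form that is common to all sequences of code points representing the same characters. There are two main kinds of normalization, one based on canonical equivalence and the other based on compatibility.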
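As a minimal sketch of that idea, a comparison can normalize both operands to the same form before testing equality. The helper name sameText and its default form are assumptions chosen for illustration.

```js
// Hypothetical helper: compares two strings after normalizing both to one form.
function sameText(a, b, form = "NFC") {
  return a.normalize(form) === b.normalize(form);
}

const precomposed = "\u00F1"; // ñ as a single code point
const combining = "\u006E\u0303"; // "n" followed by a combining tilde

console.log(precomposed === combining); // false
console.log(sameText(precomposed, combining)); // true
console.log(sameText(precomposed, combining, "NFD")); // true
console.log(sameText(precomposed, "n")); // false
```

Either form works for the comparison, as long as both sides use the same one.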
Canonical equivalence normalization
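In Unicode, two sequences of code points have canonical equivalence if they represent the same abstract characters, and they should always have the same visual appearance and behavior (for example, they should always be sorted in the same way).
You can use normalize() with the "NFD" or "NFC" argument to produce a form of the string that is the same for all canonically equivalent strings. The example below normalizes two representations of the character "ñ":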
let string1 = "\u00F1"; // ñ
let string2 = "\u006E\u0303"; // ñ
string1 = string1.normalize("NFD");
string2 = string2.normalize("NFD");
console.log(string1 === string2); // true
console.log(string1.length); // 2
console.log(string2.length); // 2
Composed and decomposed forms
Note that the length of the form normalized with "NFD" is 2. That is because "NFD" produces the decomposed canonical form, in which single code points are split into multiple combining ones. The decomposed canonical form of "ñ" is "\u006E\u0303".
You can specify "NFC" to get the composed canonical form, in which multiple code points are replaced with a single code point where possible. The composed canonical form of "ñ" is "\u00F1":
let string1 = "\u00F1"; // ñ
let string2 = "\u006E\u0303"; // ñ
string1 = string1.normalize("NFC");
string2 = string2.normalize("NFC");
console.log(string1 === string2); // true
console.log(string1.length); // 1
console.log(string2.length); // 1
console.log(string2.codePointAt(0).toString(16)); // f1
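To see exactly which code points each form contains, one option is to print them in U+XXXX notation. The helper codePoints below is a hypothetical utility written for this illustration.

```js
// Hypothetical helper: lists the code points of a string in U+XXXX notation.
function codePoints(str) {
  // Array.from iterates by code point, not by UTF-16 code unit.
  return Array.from(
    str,
    (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
  ).join(" ");
}

const text = "\u00F1"; // ñ

console.log(codePoints(text.normalize("NFD"))); // "U+006E U+0303"
console.log(codePoints(text.normalize("NFC"))); // "U+00F1"

// Decomposing and then recomposing returns the original string here.
console.log(text.normalize("NFD").normalize("NFC") === text); // true
```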
Compatibility normalization
In Unicode, two sequences of code points are compatible if they represent the same abstract characters and should be treated alike in some, but not necessarily all, applications.
All canonically equivalent sequences are also compatible, but not vice versa.
For example:
- The code point U+FB00 represents the ligature "ﬀ". It is compatible with two consecutive U+0066 code points ("ff").
- The code point U+24B9 represents the symbol "Ⓓ". It is compatible with the U+0044 code point ("D").
In some respects (such as sorting) they should be treated as equivalent, and in others (such as visual appearance) they should not, so they are not canonically equivalent.
You can use normalize() with the "NFKD" or "NFKC" argument to produce a form of the string that is the same for all compatible strings:
let string1 = "\uFB00";
let string2 = "\u0066\u0066";
console.log(string1); // ﬀ
console.log(string2); // ff
console.log(string1 === string2); // false
console.log(string1.length); // 1
console.log(string2.length); // 2
string1 = string1.normalize("NFKD");
string2 = string2.normalize("NFKD");
console.log(string1); // ff <- visual appearance changed
console.log(string2); // ff
console.log(string1 === string2); // true
console.log(string1.length); // 2
console.log(string2.length); // 2
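When applying compatibility normalization, it is important to consider what you intend to do with the strings, since the normalized form may not be appropriate for every application. In the example above, the normalization is appropriate for search, because it lets a user find the string by searching for "f". However, it may not be appropriate for display, because the visual representation is different.
As with canonical normalization, you can ask for the decomposed or composed compatible form by passing "NFKD" or "NFKC", respectively.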
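One way to act on this distinction is to keep the original string for display and derive a separate key that is used only for matching. The sketch below assumes a hypothetical helper named searchKey; lowercasing the key is an extra assumption added for case-insensitive matching and is not part of normalization.

```js
// Hypothetical helper: builds a key used only for matching, never for display.
function searchKey(str) {
  // NFKC folds compatibility characters such as the ligature U+FB00 into "ff".
  return str.normalize("NFKC").toLowerCase();
}

const title = "Bu\uFB00alo"; // displayed with the ligature ﬀ (U+FB00)
const query = "buff";

console.log(title.includes("ff")); // false, the ligature is a single code point
console.log(searchKey(title).includes(searchKey(query))); // true
console.log(title.length); // 6, the displayed string is left unchanged
console.log(searchKey(title).length); // 7
```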
Examples
Using normalize()
// Initial string
// U+1E9B: LATIN SMALL LETTER LONG S WITH DOT ABOVE
// U+0323: COMBINING DOT BELOW
const str = "\u1E9B\u0323";
// Canonically composed form (NFC)
// U+1E9B: LATIN SMALL LETTER LONG S WITH DOT ABOVE
// U+0323: COMBINING DOT BELOW
str.normalize("NFC"); // '\u1E9B\u0323'
str.normalize(); // same as above
// Canonically decomposed form (NFD)
// U+017F: LATIN SMALL LETTER LONG S
// U+0323: COMBINING DOT BELOW
// U+0307: COMBINING DOT ABOVE
str.normalize("NFD"); // '\u017F\u0323\u0307'
// Compatibly composed form (NFKC)
// U+1E69: LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
str.normalize("NFKC"); // '\u1E69'
// Compatibly decomposed form (NFKD)
// U+0073: LATIN SMALL LETTER S
// U+0323: COMBINING DOT BELOW
// U+0307: COMBINING DOT ABOVE
str.normalize("NFKD"); // '\u0073\u0323\u0307'
Specifications
| Specification |
|---|
| ECMAScript® 2027 Language Specification # sec-string.prototype.normalize |
Browser compatibility
See also
- Unicode Standard Annex #15, Unicode Normalization Forms
- Unicode equivalence (Wikipedia)