Virama: Difference between revisions



Latest revision as of 18:20, 6 April 2024

Virama (Sanskrit: विराम/हलन्त, romanized: virāma/halanta ्) is a Sanskrit phonological concept: the suppression of the inherent vowel that otherwise accompanies every consonant letter. The term is commonly used generically for a Unicode code point that represents either

halanta, hasanta or explicit virāma, a diacritic in many Brahmic scripts, including the Devanagari and Bengali scripts, or
saṃyuktākṣara (Sanskrit: संयुक्ताक्षर) or implicit virama, a conjunct consonant or ligature.

The Unicode encodings of scripts used to write Mainland Southeast Asia languages, such as those of the Burmese script and the Tibetan script, generally do not group the two functions together.

Names[edit]

The name is Sanskrit for "cessation, termination, end". As a Sanskrit word, it is used in place of several language-specific terms, such as:

Name in English books	Language	In native language	Form	Notes
halant	Hindi	हलन्त, halant	्
halanta	Punjabi	ਹਲੰਤ, halanta	੍
	Marathi	हलन्त, halanta	्
	Nepali	हलन्त, halanta	्
	Kannada	ಹಲಂತ, halanta	್
	Odia	ହଳନ୍ତ, hôḷôntô	୍
	Gujarati	હાલાંત, hālānta	્
hosonto	Bengali	হসন্ত, hôsôntô	্
	Assamese	হসন্ত, hoxonto / হছন্ত, hosonto	্
	Sylheti	ꠢꠡꠘ꠆ꠔꠧ, hośonto	◌ ꠆
pollu	Telugu	పొల్లు, pollu	్
pulli	Tamil	புள்ளி, puḷḷi	்
chandrakkala	Malayalam	ചന്ദ്രക്കല, candrakkala / വിരാമം, viraamam	്
hal kirima	Sinhalese	හල් කිරිම, hal kirīma	්
a that	Burmese	အသတ်, a.sat, IPA: [ʔa̰θaʔ]	်	lit. "nonexistence"
viream	Khmer	វិរាម, vīrāma	៑
toandokheat	Khmer	ទណ្ឌឃាត, toandokheat	៍
karan, thanthakhat	Thai	การันต์, kārạnt^[1]^[2] / ทัณฑฆาต, thanthakhat^[3]^[4]	◌์	Thanthakhat is the name of the diacritic, while karan refers to the character that is marked. The two terms are often used interchangeably. The diacritic marks as silent those vowels or consonants that were originally pronounced but have become silent in Thai pronunciation (mostly in words from Sanskrit and Old Khmer). It is sometimes also used in loanwords from European languages to mark final consonants in consonant clusters (e.g. want as วอนท์).
pinthu		พินทุ, pinthu	◌ฺ	Pinthu is akin to Sanskrit bindu, and means "point" or "dot". It is used to mark a syllable as closed, and it is only used in Thai script when writing Pali or Sanskrit.
nikkhahit		นฤคหิต / นิคหิต	◌ํ	Nikkhahit represents what was originally anusvāra in Sanskrit. Like pinthu, it is also only used when writing Pali or Sanskrit in Thai script. It marks a syllable as nasalized, realized in Thai as a nasal closed consonant following the vowel.
rahaam	Northern Thai (Lanna)	ᩁᩉ᩶ᩣ᩠ᨾ, rahaam^[5]	◌᩺
	Tai Khün		◌᩼
	Tai Lue		◌᩼
pangkon	Javanese	ꦥꦁꦏꦺꦴꦤ꧀, pangkon	◌꧀
pangkon	Balinese	ᬧᬂᬓᭀᬦ᭄, pangkon	◌᭄	Also called adeg-adeg
sukun	Dhivehi	Dhivehi: ސުކުން, sukun	ް◌	Derives from Arabic "sukun"
Srog med	Tibetan	Srog med	྄	Only used when transcribing Sanskrit
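As a rough illustration of how several of the signs in the table are represented in Unicode, the sketch below looks up a few of them with Python's standard `unicodedata` module. The `SAMPLES` dictionary and the helper `is_virama_class` are hypothetical names introduced here. The check relies on canonical combining class 9, which Unicode labels "Virama"; this is only a heuristic, since not every mark listed in the table necessarily carries that class.

```python
import unicodedata

# Hypothetical sample: a few signs from the table, written as escapes.
SAMPLES = {
    "Devanagari": "\u094D",
    "Bengali": "\u09CD",
    "Tamil": "\u0BCD",
}

VIRAMA_CLASS = 9  # canonical combining class that Unicode labels "Virama"


def is_virama_class(ch: str) -> bool:
    """Return True if ch has canonical combining class 9."""
    return unicodedata.combining(ch) == VIRAMA_CLASS


for script, ch in SAMPLES.items():
    # Print the code point, its formal Unicode name, and the class check.
    print(script, f"U+{ord(ch):04X}", unicodedata.name(ch), is_virama_class(ch))
```

Testing the combining class rather than matching on character names keeps the check independent of language-specific names such as halant, pulli or hosonto.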

Usage[edit]

In Devanagari and many other Indic scripts, a virama is used to cancel the inherent vowel of a consonant letter and represent a consonant without a vowel, a "dead" consonant. For example, in Devanagari,

क is a consonant letter, ka,
् is a virāma; therefore,
क् (ka + virāma) represents a dead consonant k.
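The three steps above can be sketched in Python. This is a minimal illustration, not an authoritative implementation; the constant names and the `describe` helper are assumptions introduced here, and only the standard `unicodedata` module is used.

```python
import unicodedata

KA = "\u0915"      # DEVANAGARI LETTER KA (carries the inherent vowel a)
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA


def describe(text: str) -> list[tuple[str, str]]:
    """List each code point in text with its formal Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# A dead consonant is stored as two code points: the letter, then the virama.
dead_k = KA + VIRAMA

for code_point, name in describe(dead_k):
    print(code_point, name)
```

The dead consonant is not a separate character: it is the ordinary consonant letter followed by the combining virama sign.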

If this k क् is followed by another consonant letter, for example ṣa ष, the result might look like क्‌ष, which represents kṣa as ka + (visible) virāma + ṣa. In this case, the two elements k क् and ṣa ष are simply placed side by side. Alternatively, kṣa can also be written as a ligature क्ष, which is the preferred form. Generally, when a dead consonant letter C₁ and another consonant letter C₂ are conjoined, the result may be:

A fully conjoined ligature of C₁+C₂;
Half-conjoined—
- C₁-conjoining: a modified form (half form) of C₁ attached to the original form (full form) of C₂
- C₂-conjoining: a modified form of C₂ attached to the full form of C₁; or
Non-ligated: full forms of C₁ and C₂ with a visible virama.^[6]

If the result is fully or half-conjoined, the (conceptual) virama that made C₁ dead becomes invisible and exists only logically, in a character encoding scheme such as ISCII or Unicode. If the result is not ligated, the virama is actually written as a visible mark attached to C₁.

These differences are only glyph variants; the three forms are semantically identical. Each language may have a preferred form for a given consonant cluster, and some scripts lack certain ligatures or half forms altogether. Even so, it is generally acceptable to use a non-ligated form in place of a preferred ligature when the font has no glyph for that ligature. In other cases, whether to use a ligature is just a matter of taste.

The virāma in the sequence C₁ + virāma + C₂ may thus work as an invisible control character to ligate C₁ and C₂ in Unicode. For example,

ka क + virāma + ṣa ष = kṣa क्ष

is a fully conjoined ligature. It is also possible that the virāma does not ligate C₁ and C₂, leaving the full forms of C₁ and C₂ as they are:

ka क + virāma + ṣa ष = kṣa क्‌ष (shown here with a zero-width non-joiner after the virāma, which keeps the virāma visible)

is an example of such a non-ligated form.
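The two encodings can be compared in a short Python sketch. It assumes the usual Unicode convention that a zero-width non-joiner (U+200C) after the virāma requests the visible, non-ligated form and that a zero-width joiner (U+200D) requests a half form. How a string is actually drawn depends on the font and rendering engine, which this code does not model. The helper `strip_joiners` is a hypothetical name introduced here.

```python
KA, VIRAMA, SSA = "\u0915", "\u094D", "\u0937"  # ka, virama, ssa
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER

conjunct = KA + VIRAMA + SSA          # renderer may draw the ligature
explicit = KA + VIRAMA + ZWNJ + SSA   # asks for a visible virama
half_form = KA + VIRAMA + ZWJ + SSA   # asks for a half form of ka


def strip_joiners(text: str) -> str:
    """Remove ZWNJ/ZWJ so that glyph-variant spellings compare equal."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


# The variants differ as raw strings but share the same underlying letters.
print(conjunct == explicit)
print(strip_joiners(explicit) == strip_joiners(half_form) == conjunct)
```

Stripping the joiners before comparison is one simple way to treat the glyph variants as the same text, for example when searching.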

The sequences ङ्क ङ्ख ङ्ग ङ्घ [ṅka ṅkha ṅɡa ṅɡha], in common Sanskrit orthography, should be written as conjuncts (the virāma and the top cross line of the second letter disappear, and what is left of the second letter is written under the ङ and joined to it).

End of word[edit]

The inherent vowel is not always pronounced, in particular at the end of a word (schwa deletion). No virāma is used to mark vowel suppression in such cases. Instead, the orthography follows Sanskrit, where all inherent vowels are pronounced, and leaves it to the reader of modern languages to delete the schwa where appropriate.^[7]
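A small Python sketch can make this concrete. The sample word राम (rāma in Sanskrit, commonly pronounced without the final vowel in Hindi) is an illustrative choice made here, not taken from the cited source, and `ends_with_virama` is a hypothetical helper.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA

# The word is spelled ra + vowel sign aa + ma, with no virama at the end.
word = "\u0930\u093E\u092E"


def ends_with_virama(text: str) -> bool:
    """Return True if the last code point of text is a Devanagari virama."""
    return text.endswith(VIRAMA)


# The spelling carries no mark for the dropped final vowel.
print(ends_with_virama(word))
print(ends_with_virama(word + VIRAMA))
```

The encoded text alone therefore cannot say whether a final inherent vowel is pronounced; that decision rests on the reader's knowledge of the language.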

References[edit]

^ "คำศัพท์ การันต์ แปลว่าอะไร?". Longdo Dict.
^ th:การันต์
^ "คำศัพท์ ทัณฑฆาต แปลว่าอะไร?". Longdo Dict.
^ th:ทัณฑฆาต
^ "Tai Tham" (PDF). The Unicode Standard. Retrieved 30 July 2022.
^ Constable, Peter (2004). "Clarification of the Use of Zero Width Joiner in Indic Scripts" (PDF). Public Review Issue #37. Unicode, Inc. Retrieved 2009-11-19.
^ Akira Nakanishi: Writing Systems of the World, ISBN 0-8048-1654-9, p. 48.

External links[edit]

Blog: Sorting it all Out


@@ Line 4: / Line 4: @@
 |unicode=}}
 '''Virama''' ({{Lang-sa|विराम/हलन्त|translit=virāma/halanta}} ्) is a [[Sanskrit]] phonological concept to suppress the [[inherent vowel]] that otherwise occurs with every consonant letter, commonly used as a generic term for a codepoint in Unicode, representing either
-# '''halanta''' or explicit '''virāma''', a [[diacritic]] in many [[Brahmic scripts]], including the [[Devanagari]] and [[Eastern Nagari]] scripts, or
+# '''halanta''', '''hasanta''' or explicit '''virāma''', a [[diacritic]] in many [[Brahmic scripts]], including the [[Devanagari]] and [[Bengali–Assamese script|Bengali]] scripts, or
 # '''saṃyuktākṣara''' ([[Sanskrit]]: संयुक्ताक्षर) or implicit virama, a conjunct consonant or ligature.
-Unicode schemes of scripts writing [[Mainland Southeast Asia linguistic area|Mainland Southeast Asia languages]], such as that of [[Burmese script]] and of [[Tibetan script]], generally don't group the two functions together.
+Unicode schemes of scripts writing [[Mainland Southeast Asia linguistic area|Mainland Southeast Asia languages]], such as that of [[Burmese script]] and of [[Tibetan script]], generally do not group the two functions together.
 == Names ==
@@ Line 30: / Line 30: @@
 |-
 |[[Marathi language|Marathi]]
-|{{indic|lang=mr|indic=हलंत|trans=halanta|defaultipa=|showlang=false}}
+|{{indic|lang=mr|indic=हलन्त|trans=halanta|defaultipa=|showlang=false}}
 |्
 |
@@ Line 66: / Line 66: @@
 |-
 |[[Sylheti language|Sylheti]]
-|{{Indic|lang=syl|indic=ꠢꠡꠘ꠆ꠔꠧ|trans=ośonto|defaultipa=|showlang=false}}
+|{{Indic|lang=syl|indic=ꠢꠡꠘ꠆ꠔꠧ|trans=hośonto|defaultipa=|showlang=false}}
 |<span style="font-family: 'Surma';"> ◌ ꠆</span>
 |
@@ Line 115: / Line 115: @@
 |{{Indic|lang=th|indic=การันต์|trans=kārạnt|defaultipa=|showlang=false|showhelp=false}}<ref name=":0">{{cite web|title=คำศัพท์ ''การันต์'' แปลว่าอะไร?|url=http://dict.longdo.com/search/%E0%B8%81%E0%B8%B2%E0%B8%A3%E0%B8%B1%E0%B8%99%E0%B8%95%E0%B9%8C|website=Longdo Dict}}</ref><ref name=":1">[[:th:การันต์]]</ref> / {{Indic|lang=th|indic=ทัณฑฆาต|trans=thanthakhat|defaultipa=|showlang=false}}<ref name=":2">{{cite web|title=คำศัพท์ ''ทัณฑฆาต'' แปลว่าอะไร?|url=http://dict.longdo.com/search/%E0%B8%97%E0%B8%B1%E0%B8%93%E0%B8%91%E0%B8%86%E0%B8%B2%E0%B8%95|website=Longdo Dict}}</ref><ref name=":3">[[:th:ทัณฑฆาต]]</ref>
 |◌์
-|While thanthakhat is the name of symbol and karan is the character that was marked. In modern day it being used interchangeably. It was being used to mark a silent character that was pronounced in original language but become silenced in Thai pronunciation (mostly from sanskrit and old khmer). Modern usage also used as a marked for intonation and last consonant of European language
+|''Thanthakhat'' is the name of the diacritic, while ''karan'' refers to the character that was marked. These two terms are often used interchangeably. It is used to mark as silent vowels or consonants that were originally pronounced, but have become silenced in Thai pronunciation (mostly from Sanskrit and [[Old Khmer]]). This diacritic is sometimes used in loanwords from European languages to mark final consonants in consonant clusters (e.g. want as วอนท์).
 |-
 |''pinthu''
 |{{Indic|lang=th|indic=พินทุ|trans=pinthu|defaultipa=|showlang=false|showhelp=false}}
 |◌ฺ
-|pinthu is akin to Sanskrit [[Bindu (symbol)|bindu]], and means "point" or "dot". Use to mark a closed syllable in sanskrit
+|''Pinthu'' is akin to Sanskrit [[Bindu (symbol)|bindu]], and means "point" or "dot". It is used to mark a syllable as closed, and it is only used in Thai script when writing Pali or Sanskrit.
 |-
 |''nikkhahit''
 |นฤคหิต / นิคหิต
 |◌ํ
-|nikkhahit is originally [[anusvāra]] in Sanskrit. To mark an end of last or single syllable with nasal closed consonant
+|''Nikkhahit'' represents what was originally [[anusvāra]] in Sanskrit. Like ''pinthu'', it is also only used when writing Pali or Sanskrit in Thai script. It marks a syllable as nasalized, realized in Thai as a nasal closed consonant following the vowel.
+|-
+| rowspan="3" |''rahaam''
+|[[Northern Thai language|Northern Thai (Lanna)]]
+| rowspan="3" |{{Indic|lang=nod|indic=ᩁᩉ᩶ᩣ᩠ᨾ|trans=rahaam|defaultipa=|showlang=false}}<ref>{{Cite web |title=Tai Tham |url=https://www.unicode.org/charts/PDF/U1A20.pdf |access-date=30 July 2022 |website=The Unicode Standard}}</ref>
+|◌᩺
+|
+|-
+|[[Khün language|Tai Khün]]
+|◌᩼
+|
+|-
+|[[Tai Lue language|Tai Lue]]
+|◌᩼
+|
 |-
 | rowspan="2" |''pangkon''
@@ Line 154: / Line 168: @@
 In [[Devanagari]] and many other [[Brahmic family of scripts|Indic scripts]], a virama is used to cancel the [[inherent vowel]] of a consonant letter and represent a consonant without a vowel, a "dead" consonant. For example, in Devanagari,
 #{{lang|sa|क}} is a consonant letter, ''ka'',
-#् is a virama; therefore,
+#् is a virāma; therefore,
-#{{lang|sa|क्}} (''ka'' + virama) represents a dead consonant ''k''.
+#{{lang|sa|क्}} (''ka'' + virāma) represents a dead consonant ''k''.
-If this ''k'' {{lang|sa|क्}} is further followed by another consonant letter, for example, ''ṣa'' {{lang|sa|ष}}, the result might look like {{lang|sa|क्‌ष}}, which represents ''kṣa'' as ''ka'' + (visible) virama + ''ṣa''. In this case, two elements ''k'' {{lang|sa|क्}} and ''ṣa'' {{lang|sa|ष}} are simply placed one by one, side by side. Alternatively, ''kṣa'' can be also written as a [[Typographic ligature|ligature]] {{lang|sa|क्ष}}, which is actually the preferred form.
+If this ''k'' {{lang|sa|क्}} is further followed by another consonant letter, for example, ṣa ष, the result might look like {{lang|sa|क्‌ष}}, which represents ''kṣa'' as ''ka'' + (visible) virāma + ''ṣa''. In this case, two elements ''k'' क् and ''ṣa'' ष are simply placed one by one, side by side. Alternatively, ''kṣa'' can be also written as a [[Typographic ligature|ligature]] {{lang|sa|क्ष}}, which is actually the preferred form.
 Generally, when a dead consonant letter C<sub>1</sub> and another consonant letter C<sub>2</sub> are conjoined, the result may be:
 #A fully conjoined ligature of C<sub>1</sub>+C<sub>2</sub>;
@@ Line 166: / Line 180: @@
 If the result is fully or half-conjoined, the (conceptual) virama which made C<sub>1</sub> dead becomes invisible, logically existing only in a [[character encoding]] scheme such as [[Indian Script Code for Information Interchange|ISCII]] or [[Unicode]]. If the result is not ligated, a virama is visible, attached to C<sub>1</sub>, actually written.
-Basically, those differences are only glyph variants, and three forms are [[semantics|semantically]] identical. Although there may be a preferred form for a given consonant cluster in each language and some scripts do not have some kind of ligatures or half forms at all, it is generally acceptable to use a nonligature form instead of a ligature form even when the latter is preferred if the font does not have a glyph for the ligature. In some other cases, whether to use a ligature or not is just a matter of taste.
+Basically, those differences are only glyph variants, and the three forms are [[semantics|semantically]] identical. Although there may be a preferred form for a given consonant cluster in each language and some scripts do not have some kind of ligatures or half forms at all, it is generally acceptable to use a nonligature form instead of a ligature form even when the latter is preferred if the font does not have a glyph for the ligature. In some other cases, whether to use a ligature or not is just a matter of taste.
-The virama in the sequence C<sub>1</sub> + virama + C<sub>2</sub> may thus work as an invisible control character to ligate C<sub>1</sub> and C<sub>2</sub> in Unicode. For example,
+The virāma in the sequence C<sub>1</sub> + virāma + C<sub>2</sub> may thus work as an invisible control character to ligate C<sub>1</sub> and C<sub>2</sub> in Unicode. For example,
-*''ka'' {{lang|sa|क}} + virama + ''ṣa'' {{lang|sa|ष}} = ''kṣa'' {{lang|sa|क्ष}}
+*''ka'' क + virāma + ṣa ष = ''kṣa'' {{lang|sa|क्ष}}
-is a fully conjoined ligature. It is also possible that the virama does not ligate C<sub>1</sub> and C<sub>2</sub>, leaving the full forms of C<sub>1</sub> and C<sub>2</sub> as they are:
+is a fully conjoined ligature. It is also possible that the virāma does not ligate C<sub>1</sub> and C<sub>2</sub>, leaving the full forms of C<sub>1</sub> and C<sub>2</sub> as they are:
 *''ka'' {{lang|sa|क}} + virama + ''ṣa'' {{lang|sa|ष}} = ''kṣa'' {{lang|sa|क्‌ष}}
 is an example of such a non-ligated form.
-The sequences ङ्क ङ्ख ङ्ग ङ्घ {{IPA|[ŋka ŋkʰa ŋɡa ŋɡʱa]}}, in common Sanskrit orthography, should be written as conjuncts (the virama and the top cross line of the second letter disappear, and what is left of the second letter is written under the ङ and joined to it).
+The sequences ङ्क ङ्ख ङ्ग ङ्घ {{IPA|[ṅka ṅkha ṅɡa ṅɡha]}}, in common Sanskrit orthography, should be written as conjuncts (the virāma and the top cross line of the second letter disappear, and what is left of the second letter is written under the ङ and joined to it).
 == End of word ==
-The [[inherent vowel]] is not always pronounced, in particular at the end of a word ([[Schwa deletion in Indo-Aryan languages|schwa deletion]]). No virama is used for vowel suppression in such cases. Instead, the orthography is based on Sanskrit where all inherent vowels are pronounced, and leaves to the reader of modern languages to delete the schwa when appropriate.<ref>Akira Nakanishi: Writing Systems of the World, {{ISBN|0-8048-1654-9}}, pp. 48.</ref>
+The [[inherent vowel]] is not always pronounced, in particular at the end of a word ([[Schwa deletion in Indo-Aryan languages|schwa deletion]]). No virāma is used for vowel suppression in such cases. Instead, the orthography is based on Sanskrit where all inherent vowels are pronounced, and leaves to the reader of modern languages to delete the schwa when appropriate.<ref>Akira Nakanishi: Writing Systems of the World, {{ISBN|0-8048-1654-9}}, pp. 48.</ref>
 ==See also==
