% commonerrors.txt

last updated on 1/5/98

This file is meant to grow into a list of common mistakes and of the
established correct ways to type Sanskrit.

There are two types of errors.  One is due to transliteration and the other
is due to wrong usage of Sanskrit terms or sandhiis.

We will use ITRANS as the transliteration scheme for the explanations.
The rules given here concern encoding and pronunciation.

One has to pay attention to details while typing, just as one would when
writing Devanagari characters.  Editing the text later to turn a vowel into
aa-kaar or ii-kaar is time consuming, more so when the typist makes many
mistakes.
 
---------------------- Transliteration -----------------------------------

Here is the scheme inserted for reference
***************************************
ITRANS 5.1 Encoding for Devanagari (Hindi/Marathi/Sanskrit) 

This section describes the ITRANS encoding, for Devanagari. This is the basic 
encoding used for all Indic language scripts. 

ITRANS 5.1 is completely compatible with the older ITRANS 4.04 release, 
so any documents encoded in ITRANS 4.04 will work correctly with ITRANS 5.1. 

Vowels (dependent and independent):
-------
a      aa / A    i      ii / I     u       uu / U 
R^i    R^I       L^i    L^I
e      ai        o      au         aM      aH

Consonants:
----------- 
k     kh     g     gh     ~N
ch    Ch     j     jh     ~n
T     Th     D     Dh     N
t     th     d     dh     n
p     ph     b     bh     m
y     r      l     v / w
sh    Sh     s     h      L
x / kSh     GY / j~n / dny     shr

R (for marathi half-RA)
L / ld (marathi LLA)

Specials/Accents:
-----------------
Anusvara:       .n / M (dot on top of previous consonant/vowel)
Avagraha:       .a    (`S' like symbol basically to replace a after o)
Ardhachandra:   .c    (for vowel sound as in english words `cat' or `talk')
Chandra-Bindu:  .N    (chandra-bindu on top of previous letter)
Halant:         .h    (to get half-form of the consonant - no vowel - virama)
Visarga:        H     (visarga - looks like a colon character)
Om:             OM, AUM (Om symbol)

A few new codes are now also accepted: 
	w (== v), 
	kSh (== ksh), 
	~N (== N^), 
	~n (== JN),
	dny (== GY), 
	^r (== .r == hindi-half-ra). 

Consonants with a nukta (dot) under them (mainly for Urdu devanagari):
-----------------------------------------
k  with a dot:      q
kh with a dot:      K
g  with a dot:      G
j  with a dot:      z
ph with a dot:      f
D  with a dot:      .D
Dh with a dot:      .Dh

*****************************

Transliteration specific corrections of common errors:

Use 

    aa or A instead of a for aa-kaar
    uu or U instead of u or oo for uu-kaar
    ii or I instead of i or ee for ii-kaar
    e instead of E or ay (Telugu influence)
    aM instead of am (word ending anusvaar); more on this at the end
    aH instead of .h (visarga)
    .h only for a half letter, as in m.h, t.h
    aa_ii or aa{}ii to keep two vowels separate, as in Hindi bhaa{}ii (brother)
                    (_ may not work in some instances)
    
    
    ka instead of kha (Tamil and Kannada influence), ka and kha are different
    ga instead of gha (Tamil and Kannada influence), ga and gha are different
    cha instead of ca (Indology or other transliteration influence)
    chha instead of cha
    ta instead of tha (Tamil and Kannada influence), ta and tha are different
    da instead of dha (Tamil and Kannada influence),  da and dha are different 
    va instead of ba  (Bengali influence), va and ba are different
    shha instead of sha or sa; sha, shha, and sa are three distinct letters
    ksh  instead of kshh
    GYa (hindi influence) instead of jjna or jJNa or dnya (Marathi influence)

    Ta-Tha-Da-Dha-Na instead of ta-tha-da-dha-na

Watch for the combinations aaa, hh, nD, Nd.
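
The combinations named above can be searched for mechanically.  Here is a
minimal Python sketch; the function name find_suspects is made up for this
illustration.  It assumes that hh inside shh and chh is intended, and that
nD or Nd right after a dot or tilde belongs to another code (such as .nD),
so those are not reported.  These exemptions are assumptions of the sketch,
not rules stated in this file.

```python
import re

# Patterns named in the "Watch for" note. The exemptions for "shh", "chh"
# and for a preceding "." or "~" are assumptions of this sketch.
SUSPECT_PATTERNS = {
    "aaa": re.compile(r"aaa"),
    "hh": re.compile(r"(?<!s)(?<!c)hh"),
    "nD": re.compile(r"(?<![.~])nD"),
    "Nd": re.compile(r"(?<![.~])Nd"),
}


def find_suspects(text):
    """Return (pattern name, offset) pairs for every suspicious match."""
    hits = []
    for name, pattern in SUSPECT_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.start()))
    # Sort by position so the report follows the order of the text.
    return sorted(hits, key=lambda hit: hit[1])
```

A hit only means that the spot deserves a second look; a person still has
to decide whether it is a real typing error.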


----------------------- Sanskrit rules: -----------------------------------

To form conjuncts with nasals, use

     N^k, N^kh, N^g, N^gh  		or ~Nk, ~Nkh, ~Ng, ~Ngh
    JNch, JNchh, JNj, JNjh		or ~nch, ~nchh, ~nj, ~njh
     NT, NTh, ND, NDh
     nt, nth, nd, ndh
     mp, mph, mb, mbh

    Each of N^, JN, N, n, m can be replaced by .n (overdot); for the
    pa, pha, ba, bha series, replace m with M instead, so that both the
    printout and the pronunciation stay correct.  Writing the overdot with
    M or .n is an accepted practice but is technically incorrect, mostly
    from the pronunciation standpoint.
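
The table above pairs each group of consonants with its own nasal.  As a
minimal Python sketch of the reverse direction (turning a word-internal .n
or M back into the nasal of the following consonant), one could write the
following.  The names CLASS_NASAL and expand_anusvara are made up for this
illustration, and the sketch only looks at the first letter of the
following consonant.

```python
import re

# First letter of the following consonant -> nasal of the same group,
# spelled with the first set of codes from the table (N^, JN, N, n, m).
CLASS_NASAL = {
    "k": "N^", "g": "N^",
    "c": "JN", "C": "JN", "j": "JN",
    "T": "N", "D": "N",
    "t": "n", "d": "n",
    "p": "m", "b": "m",
}

# .n or M directly followed by one of the consonants listed above.
ANUSVARA_BEFORE_STOP = re.compile(r"(?:\.n|M)(?=([kgcCjTDtdpb]))")


def expand_anusvara(word):
    """Replace a word-internal .n or M with the nasal of the next consonant."""
    return ANUSVARA_BEFORE_STOP.sub(lambda m: CLASS_NASAL[m.group(1)], word)
```

For instance, expand_anusvara("ma.ntra") gives "mantra", while a word such
as sa.nsaara is left alone because s has no nasal in the table.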

To use M or .n for anusvaara
    If an anusvaara (overdot) is used inside a word (word internal!)
    instead of the nasals mentioned above, we suggest that you use
    .n instead of M before every letter except p, ph, b, bh, m.
    That includes the remaining letters y, r, l, v, sh, shh, s, h, L, x, GY.
    So it will be 
    sa.nskR^ita
    sa.nvaada
    sa.nlagna
    sa.nsaara
    a.nsha
    sa.nrakshaka
    sa.nyama     
    et cetera.  It is wrong to use ma-kaar (M) for the anusvaara in these
    words.  This .n is pronounced differently from a simple M, as in
    saMsaara, and sounds more like ardhacha.ndrabi.nduu.

    This is not critical since the output with M and .n is the same.  The note
    is added more for clarification/information.  Such anusvaar-s are easy
    to fix with a Unix editor: change
    M[kgcjTDtdyrlvshLG]  to  .n[kgcjTDtdyrlvshLG]
    This affects M before each letter listed in the square brackets.
    In sed
    s/M\([kgcjTDtd]\)/.n\1/g
    s/M\([yrlvshLG]\)/.n\1/g  will be useful.  M[pbm] stay the same!
    (The trailing g applies the change to every match on a line.)
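
For those who prefer a script to sed, the same substitution can be sketched
in Python.  The function name m_to_dot_n is made up for this illustration.
The letter list is copied from the pattern above, and M before p, b, m or
at the end of a word is left untouched.

```python
import re

# M directly followed by one of these letters is rewritten as .n;
# M before p, b, m (and word-final M) is left alone.
INTERNAL_M = re.compile(r"M(?=[kgcjTDtdyrlvshLG])")


def m_to_dot_n(text):
    """Rewrite every word-internal M that should be .n, across the whole text."""
    return INTERNAL_M.sub(".n", text)
```

The lookahead keeps the following letter in place, so no back-reference is
needed in the replacement.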

As an observation, when a sibilant comes before T, Th, D, Dh, N it is
always shh, so dveshhTi !  It is never (special cases?) shht or shhd or
shhn, or shT, shD, shN (except perhaps in Hindi).

Please use .n rather than M before y, r, l, v, s, h for internal
anusvaar.  This is to avoid a `ma' pronunciation with these letters.

A word ending anusvaar written with M becomes ma-kaar when followed by a vowel

  (word)M and  a,aa,i,ii,u,uu,e,ai,o,au as a start of the following word
      become, respectively,

    ma, maa, mi, mii, mu, muu, me, mai, mo, mau .
  
    As an example, kiM aasiita becomes kimaasiita, 
                   ashvatthaM enaM becomes ashvatthamenaM .
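
The rule above is simple enough to sketch in Python.  The function name
join_final_m is made up for this illustration, and the list of vowel
starts assumes the vowels named above plus the capital alternatives A, I, U
from the ITRANS table.

```python
# First letters of the vowels a, aa, i, ii, u, uu, e, ai, o, au,
# plus the capital alternatives A, I, U.
VOWEL_STARTS = ("a", "A", "i", "I", "u", "U", "e", "o")


def join_final_m(first, second):
    """Join two words, turning a final M into m when a vowel follows."""
    if first.endswith("M") and second.startswith(VOWEL_STARTS):
        return first[:-1] + "m" + second
    # Otherwise keep the words apart and unchanged.
    return first + " " + second
```

With the examples above, join_final_m("kiM", "aasiita") gives "kimaasiita"
and join_final_m("ashvatthaM", "enaM") gives "ashvatthamenaM".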

The word ending 
   k should be k.h    as vaak.h and not vaak
   m should be m.h    as suresham.h and not suresham 
   n should be n.h    as raajan.h and not raajan
   t should be t.h    as dhyaayet.h and not dhyaayet

    and similarly for ga, cha, Ta, et cetera.
   ITRANS version 5.0 onwards accommodates word ending consonants
   and automatically adds hala.nta (.h) to them.
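
For text that has to work with older ITRANS versions, the missing .h can be
added by a small script.  The following Python sketch is only an
illustration: the function name add_halant is made up, the list of final
letters is an assumption that covers the common consonants (not the nukta
letters or x), and the input is assumed to be a single word without
punctuation.

```python
# Letters that can end an ITRANS consonant code (k, kh, sh, shh, ...).
# H (visarga) and M (anusvaar) are deliberately not in this set.
FINAL_CONSONANT_LETTERS = set("kgjTDtdnNpbmyrlvshL")


def add_halant(word):
    """Append .h to a word that ends in a bare consonant."""
    if word[-2:-1] == ".":
        # Already ends in a dot code such as .h or .n; leave it alone.
        return word
    if word[-1:] in FINAL_CONSONANT_LETTERS:
        return word + ".h"
    return word
```

So add_halant("vaak") gives "vaak.h", while "vaak.h", "raama" and "raamaH"
come back unchanged.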

Rules for visarga (H) ending word,

   Most visarga-s become sh, s, or shh depending on the first letter
   of the following word.

     H shhaT.h  becomes  shhshhaT.h 
   kaH chit.h  becomes kashchit.h
   vaaN^mayaH tvaM    becomes vaaN^mayastvaM
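
The three examples above can be reproduced with a small lookup.  The Python
sketch below is an illustration only: the names VISARGA_RULES and
join_visarga are made up, and the sketch covers just the cases where the
next word starts with ch, T, t or a sibilant.  Every other visarga sandhi
is left untouched.

```python
# (starts of the following word, what the visarga becomes).
# Order matters: "shh" must be tried before "sh", and "sh" before "s".
VISARGA_RULES = [
    (("shh", "Sh", "T"), "shh"),
    (("sh", "ch", "Ch"), "sh"),
    (("s", "t"), "s"),
]


def join_visarga(first, second):
    """Join two words, replacing a final H as in the examples above."""
    if first.endswith("H"):
        for starts, sibilant in VISARGA_RULES:
            if second.startswith(starts):
                return first[:-1] + sibilant + second
    # Cases this sketch does not cover are returned unjoined.
    return first + " " + second
```

join_visarga("kaH", "chit.h") gives "kashchit.h" and
join_visarga("vaaN^mayaH", "tvaM") gives "vaaN^mayastvaM".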

Rules for avagraha

   When a vowel ending word is joined with a word starting with a-kaar or
   aa-kaar, one avagraha .a is put for the a-kaar and two avagraha-s .a.a
   are used for the aa-kaar.
   The first vowel may or may not change during this joining.

   praNata asmi       praNato.asmi
navama adhyaayaH      navamo.adhyaayaH
   loke asmin.h       loke.asmin.h
tathaa aatmani        tathaa.a.atmani

Other sandhii

   tat.h dhaama       taddhaama

Use

    sattva instead of satva
    tattva instead of tatva


-------------------------------------------------------------------------
These rules can be programmed to identify typing errors automatically,
provided the words can be extracted from the text.
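
As a rough idea of what such a program could look like, here is a minimal
Python sketch.  The names CHECKS and lint are made up for this
illustration, and only three of the corrections from this file are
encoded: ee, oo, and satva/tatva.  The checks are plain pattern matches,
so a report is a hint for a human reader, not a verdict.

```python
import re

# (pattern, hint) pairs taken from the corrections listed in this file.
CHECKS = [
    (re.compile(r"ee"), "use ii or I for ii-kaar"),
    (re.compile(r"oo"), "use uu or U for uu-kaar"),
    (re.compile(r"^(sa|ta)tva"), "use sattva / tattva"),
]


def lint(text):
    """Return (line number, word, hint) for every suspicious word."""
    reports = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Words are extracted by splitting each line on whitespace.
        for word in line.split():
            for pattern, hint in CHECKS:
                if pattern.search(word):
                    reports.append((line_no, word, hint))
    return reports
```

More checks can be added to CHECKS as the list of common errors grows.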


We should standardize these rules as much as possible.  
Of course, we cannot add all the grammar rules of Panini, but this file can 
assist us.

Please provide your additions or corrections!


More notes on anusvaar and Nasals:

Nasals are more appropriate (correct?) to use in Sanskrit.
They at times produce a clumsy display of characters, in which case
        anusvaar is a better symbol and is widely accepted by printers all over.
The three ways of giving anusvaar in ITRANS 5.1, namely .n M .m,
        are for users' convenience and for conveying proper pronunciations
        with corresponding nasals.
There is also a vedic anusvaar, which is ardhacha.ndrabi.nduu with
        viraama or hala.nta underneath.  It is pronounced like the `un' sound
        in bounce, pronounce, sound, but with the vowel `a' in place of `u'.
        Please see svaramanj.itx/.ps for more details.  For ordinary use,
        {\m+} can be used to indicate vedic anusvaar.
Vedic anusvaar is more suitable with nasal sound for y, r, l, v, sh, shh, s.
Use M for word ending anusvaar and when the nasal which it replaces is m
        (e.g. before p, ph, b, bh, m  as in saMpadaa, saMbhaashhaN)
        If you want to be more exact, you can deviate from the word ending M
        and replace it with .n when the following word does not start
        with p, ph, b, bh, or m.  This may be a little confusing,
        but we will deal with it later as you practice it.
Use .n before the rest of the letters, including y, r, l, v, sh, s, shh, h,
             L, kSh, GY or j~n.
If you have to use .m, use it in place of M.
Sentence ending anusvaar (m sound) is commonly replaced by m.h, which
         is then followed by da.nDa | marking the line or sentence ending.

% End of common_errors.txt