1. Introduction
   1.1. Acknowledgements
   1.2. Coverage and Treatment
   1.3. Transcription
2. Overview of Verbs
   2.1. Roots
   2.2. Derived Forms
   2.3. Grammatical Categories
      2.3.1. Voice, Aspect and Mood
      2.3.2. Person, Number and Gender
   2.4. Citation Forms
3. Morphological Resources
   3.1. Stems
      3.1.1. Vowel Patterns
      3.1.2. Creating the Derived Forms
      3.1.3. Form VIII Assimilations
      3.1.4. Form-markers
   3.2. Final Forms
      3.2.1. Personal Prefixes
      3.2.2. Personal Endings
      3.2.3. Facilitative Vowel
4. Sound and Unsound Verbs
   4.1. Geminates
      4.1.1. Normal Geminates
      4.1.2. Special Geminates
   4.2. Weak Verbs
      4.2.1. First-weak Verbs
      4.2.2. Second-weak Verbs
      4.2.3. Third-weak Verbs
      4.2.4. Doubly Weak Verbs
   4.3. Unsoundness in Derived Forms
5. Phonological Changes
6. Anomalies and Alternative Forms
   6.1. Passives
   6.2. Hamza
      6.2.1. Irregularities
   6.3. Geminate Jussives and Imperatives
   6.4. Irregular Forms
I should like to express my thanks to Prof. Michael Carter, University of Sydney, for reviewing this account and clarifying the status of some verb-forms.
The Language Engine deals with Modern Standard Arabic, which as far as the forms of the verbs are concerned is the same as Classical Arabic. Modern Standard Arabic is used in formal and public situations (such as the media and politics). It is different from the various forms of colloquial Arabic, which vary from region to region and are used in informal and family situations.
It is assumed that the student knows the letters and sounds of Arabic. The transcription used is based on SAMPA, a modification of the International Phonetic Alphabet which, being designed specifically for computers, uses a limited character-set that will be accurately interpreted by all machines. As follows:
No character requires a special key-combination. Technically speaking the transcription is not strictly phonetic in that both uu and uw represent long u and both ii and ij represent long i.
Most Arabic verbs are based on a root consisting of three consonants (the triconsonantal or triliteral root): k-t-b write, H-r-f know. A few are based on a root of four consonants: t-r-dZ-m translate. The latter are known as quadriliterals, meaning "four letters" (noted in the Language Engine as Quad). Most commentators refer to the consonants in the root as radicals.
Arabic has a well-developed system for creating new words from a given triliteral or quadriliteral root; the meaning of the new word is related to the old. H-l-m, for example, means know, and H-ll-m (with the second consonant doubled) means cause to know, teach. The new words are known as "derived Forms" of the original word, "Form" being spelt with a capital F. The Forms are named with Roman numerals. The basic Form is known as Form I; H-l-m above is Form I, and H-ll-m is Form II. Triliteral roots have Forms I to X (Forms XI to XV are rare and are not dealt with in the Language Engine); the only important Forms for quadriliteral roots are Forms I, II and IV. A small complication is that triliteral and quadriliteral Forms with the same structure and behaviour do not have the same name: quadriliteral Form I is structured and behaves like triliteral Form II, quadriliteral II matches triliteral V and quadriliteral IV matches triliteral IX.
Since the procedures for creating derived Forms are procedures for creating new words, rather than procedures for creating different grammatical forms of a given word, they do not strictly speaking belong to morphology, but the Language Engine deals with them nonetheless. It has to be said that not all derived Forms of all verbs exist, and the suggestion that each Form has its own meaning (e.g. that Form II is always causative) is contentious. What is not contentious is that there is an established procedure for creating each Form, and that each Form is a new word of slightly different meaning from the Form I form.
Each verb, of whatever Form, changes to show voice, aspect, mood, person, number and gender. The set of grammatical categories, and the procedures for embodying them, are the same for all Forms.
There are two voices, active and passive; the set of aspects and moods is the same for each, except that there are no passive imperatives. There are two aspects: perfective, used broadly speaking for actions that are complete, and imperfective, used for uncompleted actions. (Some commentators call the perfective the "past" and the imperfective the "present".) The perfective aspect has only one mood, the indicative, whereas the imperfective aspect has four: indicative, subjunctive, jussive and imperative. The use of these moods is broadly as in more familiar European languages. There are no tenses - time-relationships, when it is necessary to mark them, are shown by separate particles (not dealt with in the Language Engine). Using a triliteral Form I verb as an example:
          ------------Active------------   -----------------Passive-----------------
          Perf'ive   Imperfective          Perfective       Imperfective
          --------   --------------------  --------------   -------------------------
Indic.    kataba     jaktubu               kutiba           juktabu
          he wrote   he is writing         it was written   it is being written
Subj.     -          jaktuba               -                juktaba
                     he might be writing                    it might be being written
Juss.     -          jaktub                -                juktab
                     he should be writing                   it should be being written
Imper.    -          uktub!                -                -
                     write!
In the Language Engine, subjunctives are translated with might and jussives with should in English. This is a rough-and-ready solution, serving mainly as a reminder that a form other than the indicative is called for. The English translation of an Arabic subjunctive or jussive depends very much on context, there being no single version that will serve in all cases. Similarly the perfective is always given in English as the simple past (wrote) and the imperfective indicative as the present continuous (is writing), though these translations will not be appropriate in all contexts.
For person and number, Arabic verbs show 13 forms instead of the usual 6 of more familiar European languages. As well as singular and plural there is a dual form (used to refer to two persons), and some persons of the verb mark masculine or feminine gender. There are thus potentially 18 forms (3 persons x 3 numbers x 2 genders); but not all slots are filled and there are in fact only 13 forms. Using the imperfective indicative as an example:
        3rd masc.    3rd fem.     2nd masc.    2nd fem.     1st any
Sing.   jaktubu      taktubu      taktubu      taktubiina   ?aktubu
Dual    jaktubaani   taktubaani   taktubaani   -            -
Plur.   jaktubuuna   jaktubna     taktubuuna   taktubna     naktubu
It will be seen that there is no distinction of gender in the 1st person, so removing 3 possible forms; no dual form for the 1st person (the plural is used instead), removing 1 other possible form; and no dual 2nd person fem. (the dual masc. is used instead), removing 1 more form.
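The arithmetic behind the figure of 13 can be checked mechanically. The following is a minimal Python sketch; the slot labels and the function name filled_slots are this sketch's own and are not taken from the Language Engine.

```python
from itertools import product

PERSONS = ("3rd", "2nd", "1st")
NUMBERS = ("sing", "dual", "plur")
GENDERS = ("masc", "fem")

def filled_slots():
    """Return the person/number/gender slots that have their own verb form."""
    slots = set()
    for person, number, gender in product(PERSONS, NUMBERS, GENDERS):
        if person == "1st":
            if number == "dual":
                continue      # no 1st-person dual: the plural is used instead
            gender = "any"    # no distinction of gender in the 1st person
        if person == "2nd" and number == "dual" and gender == "fem":
            continue          # the 2nd dual masc. is used instead
        slots.add((person, number, gender))
    return slots

potential = len(list(product(PERSONS, NUMBERS, GENDERS)))
print(potential, len(filled_slots()))
```

The set collapses the two 1st-person genders into one slot, which is how the three "missing" 1st-person forms disappear from the count.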
Arabic verbs have no infinitive form, so the traditional citation form (the form used to list the word in dictionaries) is the perf. indic. 3rd masc. (kataba). From this derives the practice of listing the persons in the order 3rd - 2nd - 1st, and the Language Engine follows this tradition. It does not however follow tradition in using the perf. indic. 3rd masc. as the citation form, but uses instead the bare consonantal root, with hyphens marking the slots for vowels.
It is evident that with 10 Forms, two voices, two aspects, four moods and 13 persons the Arabic verb shows great morphological variety: jazid, zidna and tatazaajaduuna for example all come from the triliteral root z-j-d. The creation of the final form of the verb is best thought of as a two-stage process: first a stem is created, mainly by adding vowels to the root, and then prefixes and suffixes are added to the stem. These processes are all carried out in a systematic way, with the result that while Arabic verbs may seem complicated, they are almost never irregular. Some additional complications arise with so-called unsound verbs, but many of these can be reduced to phonological rules. There are also a few phonological changes that apply to any verb.
Form I stems are made by adding vowels to the consonantal root; the stems of other Forms are made by adding vowels to the consonantal root, doubling a vowel or a consonant, and in some cases by adding a form-marker consisting of a prefix or infix. Each Form has four stems: active perfective and imperfective, and passive perfective and imperfective. The description which follows ignores anomalies arising from unsound verbs, which are dealt with separately later.
Vowels are normally added to the root in the pattern CvCvC for triliterals and CvCCvC for quadriliterals. (C marks any consonant, v any vowel.) The vowels are changed or omitted to express voice and aspect: katab active perfective, ktub active imperfective, kutib passive perfective, ktab passive imperfective. Since each stem has two vowel-slots, we can describe the vowelling system in terms of four possible "first vowel - second vowel" pairs, one for each of the four stems.
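As an illustration of the CvCvC and CvCCvC patterns, here is a minimal Python sketch that fills the two vowel-slots of a hyphenated root. The function name make_stem, and the use of an empty string for an omitted vowel, are assumptions of this sketch rather than the Language Engine's own conventions.

```python
def make_stem(root, first_vowel, second_vowel):
    """Fill the two vowel-slots of a hyphenated root such as 'k-t-b'.

    Triliterals follow CvCvC and quadriliterals CvCCvC.
    Pass an empty string for a vowel that is omitted.
    """
    consonants = root.split("-")
    if len(consonants) not in (3, 4):
        raise ValueError("expected a triliteral or quadriliteral root")
    head, last = consonants[0], consonants[-1]
    middle = "".join(consonants[1:-1])  # one consonant, or two for quadriliterals
    return head + first_vowel + middle + second_vowel + last

print(make_stem("k-t-b", "a", "a"))   # active perfective
print(make_stem("k-t-b", "", "u"))    # active imperfective
print(make_stem("k-t-b", "u", "i"))   # passive perfective
print(make_stem("k-t-b", "", "a"))    # passive imperfective
```

Splitting on the hyphens keeps multi-character consonant symbols such as dZ intact.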
These are the vowel-pairs for the ten triliteral and 3 quadriliteral Forms:
                  ---------Active---------   ---------Passive--------
                  Perfective  Imperfective   Perfective  Imperfective
Triliteral:
  I               a-L         0-C            u-i         0-a
  II              a-a         a-i            u-i         a-a
  III             a-a         a-i            u-i         a-a
  IV              0-a         0-i            0-i         0-a
  V               a-a         a-a            u-i         a-a
  VI              a-a         a-a            u-i         a-a
  VII             a-a         a-i            u-i         a-a
  VIII            a-a         a-i            u-i         a-a
  IX              a-a         a-i            -           -
  X               0-a         0-i            0-i         0-a
Quadriliteral:
  I               a-a         a-i            u-i         a-a
  II              a-a         a-a            u-i         a-a
  IV              a-a         a-i            u-i         a-a
L (under Perfective, triliteral Form I) means that the vowel is lexically determined, i.e. it must be looked up in the dictionary for the specific verb. C (triliteral Form I Imperfective) means that the vowel for the specific verb can be calculated from its perfective vowel (though in some cases dictionary information is also needed). 0 means that the vowel is omitted. There are no Form IX passives. The following patterns can be deduced:
The procedure for calculating the triliteral Form I active imperfective second vowel from the active perfective second vowel is as follows:
The above account is based on the stem-pattern CvCvC or CvCCvC, so the second vowel means the second of the two possible vowels. It is still called the second vowel even where the first vowel is omitted. In some procedures, a consonant and vowel can be swapped (dull from dlul, swapping the l and the u), but the u is still the original second vowel.
Once the vowels have been added, the stems of the derived Forms are made as shown below. Note that the vowel of the form-marker can vary between a and u according to rules given later, but it is still the same form-marker. The stems are given in the order active perfective, active imperfective, passive perfective, passive imperfective:
The assimilations that take place between the infixed t of Form VIII and its preceding consonant are shown below. The rule for assimilating the glottal plosive is not universal - some verbs (marked in the dictionary) preserve the sequence ?t. The exact form of the assimilations may moreover vary - zz or dd may be used instead of zd, for example - but those given are not considered wrong.
Emphatics                Voiced Dentals
---------                --------------
s~t  becomes  s~t~       dt  becomes  dd
d~t  becomes  d~t~       Dt  becomes  DD
t~t  becomes  t~t~       zt  becomes  zd
D~t  becomes  D~D~

Semivowels               Hamza
----------               -----
wt   becomes  tt         ?t  becomes  tt
jt   becomes  tt
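The assimilations in the table lend themselves to a simple lookup. The following Python sketch assumes the first radical is supplied as a single transcription symbol; the names FORM_VIII_ASSIMILATIONS, form_viii_cluster and keeps_hamza are hypothetical and chosen for this illustration only.

```python
# First radical -> cluster produced with the infixed t of Form VIII.
FORM_VIII_ASSIMILATIONS = {
    # emphatics
    "s~": "s~t~", "d~": "d~t~", "t~": "t~t~", "D~": "D~D~",
    # voiced dentals
    "d": "dd", "D": "DD", "z": "zd",
    # semivowels
    "w": "tt", "j": "tt",
    # hamza (glottal plosive)
    "?": "tt",
}

def form_viii_cluster(first_radical, keeps_hamza=False):
    """Return the first radical plus infixed t after any assimilation.

    keeps_hamza models the dictionary marking for verbs that preserve ?t.
    """
    if first_radical == "?" and keeps_hamza:
        return "?t"
    # Radicals not listed in the table take the plain infixed t.
    return FORM_VIII_ASSIMILATIONS.get(first_radical, first_radical + "t")

print(form_viii_cluster("z"), form_viii_cluster("k"))
```

Only the variants given in the table are encoded; the alternatives mentioned in the text (zz or dd for zd) would need extra entries.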
Form-markers that end in a (i.e. ta, ?a and sta) change the a to u in the passive perfective. For cases where the change to u creates the illicit string uj, see under "Phonological Changes".
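This vowel change can be sketched in a few lines of Python. The function name and the string labels for voice and aspect are assumptions of this sketch, and the follow-up repair of the illicit string uj is deliberately left out.

```python
def form_marker_for(marker, voice, aspect):
    """Return the form-marker with its vowel adjusted for the passive perfective."""
    if marker.endswith("a") and voice == "passive" and aspect == "perfective":
        return marker[:-1] + "u"
    return marker  # markers such as n have no vowel to change

for marker in ("ta", "?a", "sta", "n"):
    print(marker, form_marker_for(marker, "passive", "perfective"))
```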
Three resources are used to convert the stem of the verb into its final form: a personal prefix, a suffix showing mood, and a facilitative vowel. The vowel of the personal prefix and the facilitative vowel, like the vowel of a form-marker, will change to reflect the grammatical category.
Personal prefixes are not used in the perfective, nor in the imperfective imperative. However, the same set is used for all three remaining moods (indicative, subjunctive and jussive) in all Forms. The vowel of the prefix varies between u and a according to voice, aspect and Form; in this table (Form I active) the vowel is a:
Active imperfective indicative
        3rd masc.     3rd fem.      2nd masc.     2nd fem.      1st Any
Sing.   ja-ktubu      ta-ktubu      ta-ktubu      ta-ktubiina   ?a-ktubu
Dual    ja-ktubaani   ta-ktubaani   ta-ktubaani
Plur.   ja-ktubuuna   ja-ktubna     ta-ktubuuna   ta-ktubna     na-ktubu
There are four different prefixes in the set: ja, ta, ?a and na. ?a and na are unique as 1st sing. and plur. respectively, and the 2nd person always has ta. It is easiest to think of ja as the standard 3rd-person prefix, with ta being an exception in the 3rd fem. sing. and dual.
The personal prefix is added to the stem in front of any form-marker: in jatatabbaHu, for example, ja is the personal prefix, the first ta is the form-marker, and the second ta is part of the stem.
The general rule for vowels in personal prefixes is to use a in the active and u in the passive. However, u is used for both voices in triliteral Forms II, III and IV and in quadriliteral Form I. (Here again quadriliteral I mimics triliteral II.)
If the stem begins with a semivowel, adding a personal prefix may change j into w, as described under "Phonological Changes".
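The choice of prefix consonant and prefix vowel described in this section can be sketched as follows. This is a minimal Python illustration; the function names and the string labels for person, number, gender, voice and Form are assumptions of the sketch, and the semivowel change mentioned above is not modelled.

```python
def prefix_consonant(person, number, gender):
    """Pick the consonant of the personal prefix (?, n, t or j)."""
    if person == "1st":
        return "?" if number == "sing" else "n"
    if person == "2nd":
        return "t"
    # 3rd person: j is standard, t is the exception in the fem. sing. and dual.
    if gender == "fem" and number in ("sing", "dual"):
        return "t"
    return "j"

def prefix_vowel(voice, form, quadriliteral=False):
    """Pick the vowel: a in the active, u in the passive, with exceptions."""
    if voice == "passive":
        return "u"
    u_forms = {"I"} if quadriliteral else {"II", "III", "IV"}
    return "u" if form in u_forms else "a"

def personal_prefix(person, number, gender, voice, form, quadriliteral=False):
    return prefix_consonant(person, number, gender) + prefix_vowel(
        voice, form, quadriliteral
    )

print(personal_prefix("3rd", "sing", "masc", "active", "I") + "ktubu")
```

Keeping the consonant and the vowel in separate functions mirrors the text, where the consonant depends on person, number and gender and the vowel on voice and Form.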
Personal endings are straightforward. They are different for each aspect and mood, but the same set (with one variation) is used for all Forms and for both active and passive voice. As follows:
Perfective indicative
        3rd masc.     3rd fem.      2nd masc.     2nd fem.      1st Any
Sing.   katab-a       katab-at      katab-ta      katab-ti      katab-tu
Dual    katab-aa      katab-ataa    katab-tumaa
Plur.   katab-uu      katab-na      katab-tum     katab-tunna   katab-naa

Imperfective indicative
        3rd masc.     3rd fem.      2nd masc.     2nd fem.      1st Any
Sing.   jaktub-u      taktub-u      taktub-u      taktub-iina   ?aktub-u
Dual    jaktub-aani   taktub-aani   taktub-aani
Plur.   jaktub-uuna   jaktub-na     taktub-uuna   taktub-na     naktub-u

Imperfective subjunctive
        3rd masc.     3rd fem.      2nd masc.     2nd fem.      1st Any
Sing.   jaktub-a      taktub-a      taktub-a      taktub-ii     ?aktub-a
Dual    jaktub-aa     taktub-aa     taktub-aa
Plur.   jaktub-uu     jaktub-na     taktub-uu     taktub-na     naktub-a

Imperfective jussive
        3rd masc.     3rd fem.      2nd masc.     2nd fem.      1st Any
Sing.   jaktub        taktub        taktub        taktub-ii     ?aktub
Dual    jaktub-aa     taktub-aa     taktub-aa
Plur.   jaktub-uu     jaktub-na     taktub-uu     taktub-na     naktub

Imperfective imperative
        3rd masc.     3rd fem.      2nd masc.     2nd fem.      1st Any
Sing.                               uktub         uktub-ii
Dual                                uktub-aa
Plur.                               uktub-uu      uktub-na
There are some relationships between the imperfective endings that make them easier to remember. Where the indicative has a single u, the subjunctive has a single a and the jussive has nothing; and where the indicative has a long vowel followed by na or ni, the subjunctive and jussive have just the long vowel.
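These relationships can be expressed as a small derivation from the indicative ending. The Python sketch below is illustrative only: the function name is hypothetical, and the third branch (endings such as the fem. plur. na, which stay the same in all three moods) is read off the tables rather than stated in the text.

```python
def subjunctive_and_jussive(indicative_ending):
    """Derive (subjunctive, jussive) endings from an imperfective indicative ending."""
    e = indicative_ending
    if e == "u":
        # single u: subjunctive has a, jussive has nothing
        return ("a", "")
    long_vowel = len(e) == 4 and e[0] == e[1] and e[0] in "aiu"
    if long_vowel and e[2:] in ("na", "ni"):
        # long vowel followed by na or ni: keep just the long vowel
        return (e[:2], e[:2])
    # anything else (e.g. fem. plur. na) is unchanged across the moods
    return (e, e)

for ending in ("u", "iina", "aani", "uuna", "na"):
    print(ending, subjunctive_and_jussive(ending))
```

This covers sound verbs only; the geminate variation described next is not modelled.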
The variation mentioned above occurs in the forms that normally have no ending (jussive sing. 3rd masc., 3rd fem. and 2nd masc., and the imperative sing. 2nd masc.). For these forms, geminate verbs (a class described elsewhere) take the ending a, e.g. dull-a point, instead of no ending. They also use a modified stem; the result of these two differences is that their jussives are the same as their subjunctives. See further under "Geminate Jussives and Imperatives".
For convenience, the Language Engine uses 0 to denote a "nil ending" and then removes it.
No Arabic word can begin with two consonants; however, this combination can be thrown up when a vowel is omitted or a form-marker added in forming the stems described above. In such cases a facilitative vowel (i or u) is added to the front of the word: inkatab from nkatab. For this reason the form-marker prefixes are given by some commentators as in and ista rather than n and sta.
For non-imperatives, the facilitative vowel is i in the active and u in the passive. For imperatives it is normally i; but (this can apply only to triliteral Form I) if either the second active perfective vowel or the second active imperfective vowel is u, then the facilitative vowel is u.
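The choice of facilitative vowel can be sketched in Python as below. The function names and parameters are hypothetical. The cluster test assumes that each consonant is written with a single character, which does not hold for every symbol in the transcription (for example dZ or the emphatics written with ~), so this is a simplification.

```python
VOWELS = "aiu"

def facilitative_vowel(voice, imperative=False, perf_vowel=None, imperf_vowel=None):
    """Choose i or u according to voice and, for imperatives, the second vowels."""
    if imperative:
        # u only if either second active vowel (perfective or imperfective) is u
        return "u" if "u" in (perf_vowel, imperf_vowel) else "i"
    return "i" if voice == "active" else "u"

def add_facilitative(word, vowel):
    """Prefix the vowel only when the word would begin with two consonants."""
    if len(word) >= 2 and word[0] not in VOWELS and word[1] not in VOWELS:
        return vowel + word
    return word

print(add_facilitative("nkatab", facilitative_vowel("active")))
print(add_facilitative("ktub", facilitative_vowel("active", True, "a", "u")))
```

The first call reproduces inkatab from nkatab, and the second the imperative uktub used in the tables of personal endings.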
Adding a facilitative vowel to a verb beginning with a semivowel can bring about a change in the semivowel - see under "Phonological Changes".
"Unsound" is a catch-all term for systematic irregularities in verbs. Unsound verbs are of two types, geminate and weak. Geminate verbs are those where the last two consonants are the same, as in d-l-l point. (Some commentators use the term "doubled" instead of geminate.) Weak verbs are those where one or more of the consonants consists of the semivowels j or w, for example r-m-j throw. Verbs which show neither of these anomalies are called "sound".
Geminate verbs have two stems, one regular (or uncontracted) and one contracted. The contracted stem is formed in one of two ways, depending on the Form. One way is to remove the second vowel: XaadZdZ (Form III) from X-dZ-dZ. The other way is to swap the second consonant and the second vowel, as in dull from the regular Form I stem dlul. The normal method is the first. But with triliteral Form I imperfectives and with either aspect of triliteral Forms IV and X - that is, where the first vowel is removed as part of the normal stem-forming process and so the second vowel must remain - the second method is used instead.
The personal ending determines which stem is used: if it starts with a vowel then the contracted stem is used, but otherwise the regular stem is used. (Note however that this rule will not work for commentators who ignore some of the classical endings.)
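The stem-selection rule can be sketched as a one-line test on the personal ending. In the Python below, the function name is hypothetical, and the regular and contracted stems are supplied by hand rather than generated.

```python
def geminate_stem(regular, contracted, ending):
    """Use the contracted stem before a vowel-initial ending, else the regular stem."""
    return contracted if ending[:1] in ("a", "i", "u") else regular

# Form I active imperfective of d-l-l: regular stem dlul, contracted stem dull.
for ending in ("u", "na"):
    print("ja" + geminate_stem("dlul", "dull", ending) + ending)
```

An empty ending would select the regular stem here, but geminates do not take a nil ending in the forms where other verbs have one, so that case is left to the separate treatment of geminate jussives and imperatives.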
Verbs of triliteral Form IX and quadriliteral Form IV double their final consonant and so become a special sort of geminate, with an uncontracted stem and a contracted stem. Their behaviour needs to be examined in some detail.
Weak verbs are called first-weak, second-weak and third-weak according to whether the first, second or third consonant is the semivowel: j-q-n, h-j-b, l-q-j. The term "hollow" is frequently used for second-weak. Stems can also be doubly weak - that is, two of the consonants can be semivowels.
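The terminology can be illustrated with a classifier that works purely from the shape of a triliteral root. This Python sketch uses hypothetical names and ignores dictionary markings, so a root it labels as weak may still be one of those that behave regularly.

```python
SEMIVOWELS = {"j", "w"}
WEAK_LABELS = ("first-weak", "second-weak", "third-weak")

def classify_root(root):
    """Label a hyphenated triliteral root as sound, geminate and/or weak."""
    consonants = root.split("-")
    if len(consonants) != 3:
        raise ValueError("this sketch handles triliteral roots only")
    labels = []
    if consonants[1] == consonants[2]:
        labels.append("geminate")
    for label, consonant in zip(WEAK_LABELS, consonants):
        if consonant in SEMIVOWELS:
            labels.append(label)
    return labels or ["sound"]

for root in ("k-t-b", "d-l-l", "j-q-n", "h-j-b", "l-q-j", "w-l-j"):
    print(root, classify_root(root))
```

A doubly weak root simply receives two labels, as w-l-j does.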
First-weak verbs are straightforward: those with j and some of those with w (marked in the dictionary) are regular; the remaining ones with w lose the w, but only in the triliteral Form I active imperfective (where the first vowel is also deleted).
A few second-weak verbs with w retain it (this is marked in the dictionary) and so behave regularly, but the rest follow a more complicated pattern. They have two stems, a short stem, best thought of as a special form of the regular stem, and a long stem best regarded as anomalous. The rule for choosing between them is the same as for geminate verbs: if the personal ending starts with a vowel the long or anomalous stem is used; but if the ending starts with a consonant, or there is no personal ending, the short or regular stem is used. (And like the rule for geminate verbs, this rule will not work for commentators who ignore some endings.)
Both stems are formed by removing the first vowel (if not already removed) and the middle consonant from a notional regular stem; the remaining vowel is then coerced, in either active aspect of Forms VII and VIII, to a. For the short stem no further action is required. The long stem further coerces the vowel to a in triliteral Form I active perfective, then doubles the vowel:
                  ----Active---   ---Passive---
                  Perf.  Imperf.  Perf.  Imperf.
Form I
  Notional stem:  xawif  xwaf     xuwif  xwaf
  Short stem:     xif    xaf      xif    xaf
  Long stem:      xaaf   xaaf     xiif   xaaf
Form VII
  Notional stem:  qajad  qajid    qujid  qajad
  Short stem:     qad    qad      qid    qad
  Long stem:      qaad   qaad     qiid   qaad
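The procedure for second-weak stems can be checked against the table with a short Python sketch. It assumes single-character segments and a notional stem of the shape CvCvC or CCvC, as in the examples; the function name and the string labels are this sketch's own.

```python
def hollow_stems(notional, form, voice, aspect):
    """Return (short_stem, long_stem) for a second-weak verb.

    Removing the first vowel and the middle consonant leaves the first
    consonant, the second vowel and the last consonant.
    """
    first, vowel, last = notional[0], notional[-2], notional[-1]
    # Forms VII and VIII coerce the vowel to a in either active aspect.
    if form in ("VII", "VIII") and voice == "active":
        vowel = "a"
    short_stem = first + vowel + last
    long_vowel = vowel
    # The long stem also coerces to a in the Form I active perfective.
    if form == "I" and voice == "active" and aspect == "perfective":
        long_vowel = "a"
    long_stem = first + long_vowel * 2 + last  # then the vowel is doubled
    return short_stem, long_stem

print(hollow_stems("xawif", "I", "active", "perfective"))
print(hollow_stems("qajid", "VII", "active", "imperfective"))
```

As in the table, the Form VII stems are shown without the form-marker.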
Third-weak verbs look more complicated than they are: the strange-looking forms arise from a fairly simple set of rules dealing with conflicts between sounds.
Given that Arabic has three vowels and two semivowels, third-weak verbs can have one of six endings: iw, ij, aw, aj, uw and uj. iw and uj are illegal sequences, so iw changes to ij and uj must be assumed to change to uw (no examples to hand). This leaves four possible endings: ij, uw, aj and aw.
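The reduction from six endings to four can be sketched in Python as follows. The names REPAIRS and possible_endings are hypothetical, and the uj to uw repair is the assumed one noted above.

```python
from itertools import product

# Illegal sequence -> the sequence it changes to.
REPAIRS = {"iw": "ij", "uj": "uw"}

def possible_endings():
    """Combine three vowels with two semivowels, then repair illegal sequences."""
    raw = [vowel + semivowel for vowel, semivowel in product("aiu", "jw")]
    return sorted({REPAIRS.get(ending, ending) for ending in raw})

print(possible_endings())
```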
The behaviour of these four endings follows five patterns on the basis of the personal ending: no ending, endings starting with a consonant, endings starting with a short vowel (a or u, but excluding at), endings starting with a long vowel (aa, ii or uu), and endings starting with at.
These rules account for all third-weak forms.
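The five-way division of personal endings described above can be sketched as a classifier. In this Python illustration the function name and the pattern labels are hypothetical; the nil ending is accepted either as an empty string or as the 0 placeholder used by the Language Engine.

```python
def ending_pattern(ending):
    """Assign a personal ending to one of the five third-weak patterns."""
    if ending in ("", "0"):
        return "no ending"
    if ending.startswith("at"):
        return "at"              # checked before the short-vowel case
    if ending[0] not in "aiu":
        return "consonant"
    if len(ending) >= 2 and ending[1] == ending[0]:
        return "long vowel"      # aa, ii or uu
    return "short vowel"         # a or u

for ending in ("", "ta", "a", "u", "aani", "iina", "uu", "at", "ataa"):
    print(repr(ending), ending_pattern(ending))
```

The order of the tests matters: at and ataa must be recognised before the general short-vowel case.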
Verbs can be doubly weak - that is, two of the consonants can be semivowels. w-l-j, for example, is both first-weak and third-weak, with the initial w following the rules for first-weak verbs and the final j following the rules for third-weak verbs.
Unsound verbs do not behave unsoundly in all Forms: in some Forms they simply follow the pattern for sound verbs. As follows:
Some changes are purely phonological, and apply to all verbs:
Passive verbs in Arabic cannot specify an agent: the passive of "Saleem wrote a book" is not "a book was written by Saleem" but merely "a book was written", and Saleem as the author cannot be included. In fact the meaning of the passive in Arabic is less specific than this, and this sentence would be better rendered as "a book got written", or "there was a writing of a book".
This looseness of meaning may account for the existence in Arabic of passive forms of intransitive verbs, difficult to make sense of in English. The passive of "they ran to the square" means something like "there was a running of people to the square", and the passive of "he was safe" means "he had a feeling of safety". So the Language Engine renders all passives into English using "there is / was" and a verbal noun.
In this, the Language Engine does not follow the western tradition of treating Arabic passives as though they were direct equivalents of English passives.
Some commentators describe the presence of the glottal plosive ? (or hamza) in the root as a form of unsoundness, similar to that created by the semivowels j and w. But among the verbs dealt with in the Language Engine the glottal plosive ? produces only six irregularities, detailed below. This is not enough to justify claims of systematic unsoundness.
Some commentators add an initial ? to all verb-forms beginning with a vowel followed by two consonants, e.g. ?inqalaba instead of inqalaba. ?- represents the post-pausal pronunciation; in juncture, by contrast, the pronunciation is -nqalaba (with the initial vowel being replaced by the terminal vowel of the previous word). Since the initial ? is simply one type of context-dependent phonetic adjustment, and not an inherent part of the word, the Language Engine does not add it.
A further problem arises for these commentators when they add an initial ? to a verb whose first radical is itself ?. This can create the string ?v?C, and this string is normally modified to ?vvC. So they change the imperative of ?-m-l hope (for example) from u?mul to ?u?mul and then to ?uumul. Since the Language Engine doesn't make the first change, it doesn't make the second change either.
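For comparison, the modification of ?v?C to ?vvC that these commentators apply can be written as a single substitution. The Python sketch below illustrates their rule only, not something the Language Engine does, and the function name is hypothetical.

```python
import re

def smooth_initial_hamza(word):
    """Rewrite an initial ?v?C as ?vvC, where v is a short vowel and C a consonant."""
    return re.sub(r"^\?([aiu])\?(?=[^aiu])", r"?\1\1", word)

print(smooth_initial_hamza("?u?mul"))
print(smooth_initial_hamza("u?mul"))
```

The second call leaves u?mul untouched, since without the added initial ? the string ?v?C never arises.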
The irregularities produced by the occurrence of ? in the root are as follows:
The Language Engine produces these forms by using a 'hamza flag' in the dictionary, though it might be better (since there are so few of them) to present them as simple exceptions.
In the jussive and imperative, in those persons where other verbs take a nil personal ending, the geminates take the ending a and use the contracted stem: jaSidda. The personal ending and the stem are related phonologically, so if geminates used the regular nil ending they could also use the regular (or uncontracted) stem: *jaSdid. However, they do not now do this: the uncontracted forms with nil ending do not exist in Modern Standard Arabic, and even in Classical Arabic their status alongside the contracted forms with the ending a was decidedly marginal.
Using the ending a creates a disturbance in the system, since without it the set of jussive and imperative personal endings would be constant across all verbs, as are the sets of personal endings for all other moods. This may be why a formidable number of contemporary western grammars, many of them from prestigious institutions, list the non-existent uncontracted forms, either alone and without comment, or as alternatives (and sometimes as the preferred alternatives) to the contracted forms.
The only form in the Language Engine which is strictly irregular, in the sense that it cannot be accounted for by any rule, is lajs for the active perfective stem of l-j-s not to be.