The American Speech-Language-Hearing Association (ASHA) position statement defines Childhood Apraxia of Speech (CAS) as follows: "Childhood apraxia of speech (CAS) is a neurological childhood (pediatric) speech sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits (e.g., abnormal reflexes, abnormal tone). CAS may occur as a result of known neurological impairment, in association with complex neurobehavioral disorders of known and unknown origin, or as an idiopathic neurogenic speech sound disorder. The core impairment in planning and/or programming spatiotemporal parameters of movement sequences results in errors in speech sound production and prosody." Accordingly, researchers consider the core problem in CAS to be a deficit in the transcoding stage (planning/programming) of the speech production process. This stage involves translating the linguistic message into the details of which muscles are to be moved, including their sequence and timing, in order to express the message. In the motor execution stage, motor plans are sent through nerves to the musculature, and physical movements (production) are performed by the speech-producing organs. The available literature has mostly emphasized the deficits of motor planning/programming in CAS [2-5]. However, it may be equally important to consider motor execution, because clinical experience has shown that it is through production and practice that motor speech planning is strengthened and memory is formed.
The case was a boy aged 6 years and 4 months with suspected CAS, whom the school psychologist had diagnosed with developmental delay and referred to a speech therapy clinic. The child had no speech at all, and his only way of communicating with those around him was through gestures, body language, and unintelligible vocalizations such as grunting. No language other than Persian was spoken at home, and the dialect used in this study was Farsi, which is spoken in Tehran and some parts of Iran. Preliminary evaluation showed that the child did not have any vision or hearing problems. He had twice received a brief course of speech therapy, neither of which was successful. His medical history included tonsillectomy and allergies. In terms of developmental history, standing and walking were acquired with moderate delay. The child had not yet fully mastered grasping a pencil or independent toileting, and could not utter any words or even repetitive syllables. He could not repeat sounds or words, but he understood what those around him were saying and answered yes/no and who/what/where questions non-verbally. Because of attention and concentration deficits and the child's lack of cooperation, it was not possible to assess him at first. A team consisting of a speech-language pathologist (SLP) specializing in childhood apraxia (who was also the author of the current study), a child psychologist, a child development specialist, and the family examined his behavioral, communication, language, and speech problems.
The SLP assessed speech and language in a room free of distracting stimuli, applying minimal physical restraint. Dynamic assessment of single-word and multiword receptive language skills in a play format showed that the child performed acceptably in those skills. The structures assessed at this stage were as follows:
• Person name and object name: In response to cues such as "X o peyda kon" in Persian ("Find X" in English), "X o neshun bede" in Persian ("Show me X" in English), or "X kojast?" in Persian ("Where's X?" in English) the child should show or look at objects or people around him or point to them.
• Action name: The child is asked to perform different actions on objects or on himself, such as "boxore" in Persian ("Eat" in English) or "gorbeh boxore" in Persian ("Catie eat" in English).
• Possessed + possessor: In response to cues such as "X e baba ku?" in Persian ("Where's Daddy's X?" in English) or "X e maman ku?" in Persian ("Where's Mommy's X?" in English) the child should point or indicate.
• Object + action: On request, the child must respond with both the action and the object, such as "tupo bendaz" in Persian ("Push ball" in English), choosing from a group of objects that might be pushed (e.g., a truck).
• Agent + action: The child is required to respond to both the agent and the action, such as "arusak mixore" in Persian ("Doll eats" in English).
• X + negative: Structures in which na-/ne- is added to the beginning of modal auxiliaries or verbs are assessed. For example, present two boxes, one full and one empty, and say "kodum-yeki tup nadare?" in Persian ("Show me 'No ball'" in English).
• X + location: The child is asked to show objects in different locations, such as "gorbe tuye mashino neshun bede" in Persian ("Show me cat (in) car" in English).
• Entity + attribute: Placing two contrasting items (big-little) before the child and asking for one, as in "tupe kuchiko behem (neshun) bede" in Persian ("Show (give) me little ball" in English).
• Understanding one-, two-, and three-step commands: Assess whether the child is able to follow directions with different lengths and complexities during play.
The child's severely limited phoneme repertoire made it impossible to assess his expressive language skills. At this stage, the SLP assessed motor imitation, oral motor imitation abilities, and sound-making ability. The results showed that, when he paid enough attention, the child performed acceptably in motor imitation (single actions, such as clapping hands; repetitive actions, such as hitting the table twice; and two different actions, such as clapping hands and then jumping). The child's performance in oral motor imitation was as follows. He performed skills such as blowing bubbles, blowing out cheeks, closing lips tightly, giving raspberries, licking, puckering or smacking lips, and smiling in front of a mirror when the SLP lightly touched the area around his mouth. He succeeded at actions such as kissing and pouting, opening and closing the mouth, and sticking out the tongue when the SLP provided a model before imitation, but he had difficulty with alternating movements even with mirror cueing. A slight decrease in the muscle tone of the tongue and lips was evident. The SLP recorded two 20-minute samples of the child's vocalizations during communication with his parents and play with peers. A few vowel-like sounds, distorted vowels /e/ and /a/, and two consonant-vowel (CV) syllables in which the consonant was /b/ or /n/ could be extracted from these samples. The child was not able to imitate or repeat any words or even syllables; as a result, it was not possible to record any signs of inconsistency. Occasionally he imitated some consonants, such as /b/, /n/, and /m/, but the signs of disrupted coarticulatory transitions between consonants and vowels were quite obvious. In the assessment of sound-making ability, the child was unable to imitate the SLP's single-sound model, reduplicated vocal model (reduplicated CVCV), or nonrepetitive multi-syllable (CVCV) vocal model, or even to join the SLP in simultaneous or alternating vocalizing.
An evaluation of intellectual functioning carried out within the past year showed borderline nonverbal intelligence and low verbal intelligence. The child's attention deficit was so severe that he did not benefit from any of the auditory, visual, or tactile cues. Taken together, the diagnostic evidence, including poorer expressive than receptive language, a reduced phonetic inventory, impaired volitional oral movements, and disrupted coarticulatory transitions in CV syllables, indicated CAS. The SLP's recommendations were as follows: choosing suitable augmentative and alternative communication devices to support the child's communication, and treatment focused on speech praxis and on improving the child's expressive language skills.
The long-term therapeutic goals for this case were as follows: 1. Use augmentative and alternative communication efficiently in different communicative contexts, 2. Create realistic expectations in the parents and encourage them to cooperate in setting therapy targets, 3. Facilitate and improve motor planning, programming, and execution for sequential speech movements, 4. Improve the suprasegmental aspects of speech in line with production. The SLP saw the child for three 30-minute sessions a week. The treatment room was free of distractions, and the child sat in front of the therapist with little physical control. At all stages, two approaches formed the basis of treatment: multisensory cueing and prompts for restructuring oral muscular phonetic targets (PROMPT). In the first step, the goal was to stimulate vocalizations, sound productions, and consonant-vowel (CV) combinations using environmental sounds such as animal sounds, the sounds of electrical appliances, car engines, whistles, and the sound of blowing. The SLP combined different vocalizations with appropriate prosodic features and body movements to convey meaning. Each time, the therapist gently held the child's face toward himself and facilitated production with auditory, visual, and tactile cues. Initially, the therapist responded to any vocalization that immediately followed the model. If the child did not respond, the therapist accompanied the sound with an action, such as hitting the ball and saying "poo" or tasting something sweet and saying "mmm." After 21 sessions, the child was able to produce the six Persian vowels (/a/, /æ/, /e/, /o/, /u/, /i/), the consonants /p/, /f/, /m/, /x/, /sh/, /s/, /d/, /t/, /l/, and /y/, CV combinations, and reduplicated CVCV forms.
At this stage, the SLP, with the help of the family, prepared a list of useful and meaningful words with established syllabic structures (CV and CVCV) to be used in therapy sessions, in homework, and for practice in real communication environments. In parallel, the SLP began training CVC structures and nonrepetitive multi-syllable (CVCV) structures with different consonants and vowels. The multisensory cueing approach was still the main treatment method. Interestingly, if the child did not look at the therapist's face, it was almost impossible for him to learn the movement pattern of the target word. The child seemed to rely heavily on his sense of sight to learn words. After 71 sessions, the child was able to produce 68 high-frequency words with the syllabic structures CV, CVCV, and CVC, but he had not yet started to produce some consonants (/k/, /g/, /q/, /z/, /v/, /j/, /r/). The child's words contained only consonants and vowels that were in his sound inventory. Gradually, syllabic structures including CV-CVC, CVC-CVC, and CVC-CV were added to the treatment list. For each session, the therapist prepared a list of one or two words with new structures and three or four useful phrases. Part of each treatment session was spent practicing previously learned words in real communication situations. At this point, the child was trying to imitate each new word he heard in his surroundings or on television. Finally, after 138 sessions of treatment, the first syllabic structures with consonant clusters were added to the list of target words, but the child did not succeed in learning them. At this stage, the child's vocabulary had reached more than 120 words, and he was able to spontaneously produce about 30 functional phrases in everyday communication with others. Each week, the child spontaneously added 2 to 4 new words to his vocabulary. At this time, because of the COVID-19 epidemic, the SLP stopped face-to-face treatment sessions and monitored the child's training exercises via video calls.
The progress of the treatment appeared acceptable, and the family was satisfied with it. Accordingly, they wanted the child's treatment plan to continue. The SLP decided to use the video modeling technique to practice the production of words and phrases. Pictures or videos of the mouth can be used to illustrate how specific sounds are produced. Previous work has also suggested the use of mouth pictures and videos in the treatment of CAS. Examples include: 1. Lindamood Phoneme Sequencing Program for Reading, Spelling, and Speech, 2. LipSync Moving Sound Formation Cards, 3. See It, Say It Sound Production Flip Book and Activities for Apraxia and More!. The University of Iowa website also has videos of specific phoneme production. Such resources are available for English only and cannot be used for Persian. Many programs and apps available for CAS have been developed to facilitate production of phonemes in isolation or in consonant-vowel (CV) structures. For the first time, the researcher of the current study has proposed video modeling of the production of words and phrases in Persian. This was not simple video modeling: its design was based on the syllabic structure of Persian and the use of prosodic varieties. The technique can be used not only in video modeling but also in face-to-face sessions. It is grounded in the theory of motor learning and in articulation-based and rhythmic approaches. The unique features of this technique both encourage the child to produce words and facilitate that production. The main principles of the technique were as follows: 1. Consider the syllabic structure of the word, 2. Increase the melodic duration and loudness of vowels within words, accompanying them with hand movements where helpful, 3. Emphasize place and manner during the production of consonants, 4. Slow down the rate of production and repeat many times to achieve mastery.
The SLP recorded the videos from the front, showing the mouth area of the face. Every week, 14 videos consisting of 6 new words, 2 new functional phrases, and 6 words from previous sessions were prepared and sent. The parents transferred the video files to the child's tablet, and he watched them many times with motivation and interest. Interestingly, he sometimes tried to produce the target simultaneously while looking at the model. Five weeks into home quarantine, out of a total of 40 videos of new words and functional phrases, the child had learned to produce 17 words and 4 functional phrases correctly or in close approximation to the model. The treatment process is still ongoing, and under quarantine conditions video modeling is the basis of treatment. At present, the child is able to produce more than 137 words and 34 phrases spontaneously.
The results of the current case report suggest that, to strengthen motor speech planning, motor speech plans must be executed and repeated. Motor execution is therefore necessary for motor speech planning. Numerous publications have pointed to a deficit of motor planning/programming as the core impairment in their definitions of apraxia [1-5]. These definitions do not include important components such as "immaturity" of motor execution skills. It seems that a child with CAS who has been deprived of speech production until the age of six has, in addition to a deficit in motor speech planning, severe immaturity (without muscle weakness) in motor execution skills and production. The results of a study by Edeal and Gildersleeve-Neumann (2011) also support the importance of executing motor plans and repeating them. It is likely that if proper interventions are not given at an early age, this immaturity will increase with age. The challenging question is: "For children with CAS similar to the child described here, is it enough for definitions simply to point out the deficit of motor speech planning?" For speech and language pathologists with little work experience, the presence of motor execution deficits is usually reminiscent of disorders such as dysarthria. On this basis, it is suggested that a few changes to the existing definitions of CAS might make them more comprehensive.
Evidence from the present case also highlighted the importance of attentional skills and visual input in supporting motor speech planning and execution: learning was impossible unless the child looked at the therapist's mouth. This observation prompted the therapist to use video modeling to continue the treatment during home quarantine. The therapist and the child's family achieved relatively satisfactory results. The use of multisensory cues to facilitate the accurate production of new motor speech plans has also been discussed in various articles [11-13]. Mouth pictures and videos can serve as one of the visual cues in a multisensory approach, and their importance has been mentioned in various sources [6-8]. Many of these sources have focused on videos showing how to produce sounds in isolation or in syllables. However, the current study may offer two innovations in this field. First, Persian has its own syllabic structures, and so far researchers have not studied the use of video modeling in the treatment of CAS in Persian. A distinctive feature of the video modeling technique in the current study was its consideration of the syllabic structures of Persian and its use of prosodic varieties to facilitate production. Second, the materials presented in this technique were not individual sounds or syllables but functional single words and phrases used in daily life.
As a case report, the present study provides the lowest level of evidence. The researcher therefore suggests that studies with adequate sample sizes and appropriate methods be designed to examine comprehensively the points raised by this research.
The author received no financial support for the research, authorship, and publication of this article.