Administering an SRT
Pre-Testing
Before you do anything, you need to make sure you have everything in place for training the test administrators. To train them, you will need to carry out the following steps:
1. Prepare the training materials ⇒ You will have already prepared the final test in both elaborated transcription and recorded form by working through Developing an SRT. In addition, you will have the responses of at least 20 of the participants who helped you develop the SRT. You can use 10 of these responses to train test administrators, and the other 10 for the third step below, ensuring administrator reliability. What you'll create are, in effect, 10 test recordings. Each one will contain the 15 final test sentences and the responses of one of the 10 test development participants. You'll also need to prepare score sheets for your administrator training; these can be edited down to the final 15 sentences from the scoresheets you used in test development. We recommend that during training you encourage your administrators to note down the specific type of error they hear occurring. This is unnecessary in the field, where a simple mark can be used, but in training you need to know whether they understand the differences between the types of errors so that they can score accurately.
2. Train the administrators ⇒ Your administrators will need to be familiar with several things before they can administer a test: each line of the transcription, including the IPA; the scoring system; and the practicalities of carrying out a test (see below for more). They should also be able to repeat all the test sentences and follow the transcript along with the recording. At this point, they can listen to the 10 practice recordings you have prepared and score them one by one, comparing their answers to the master scoresheet. They should repeat this as many times as they need to become confident scorers.
3. Ensure administrator reliability ⇒ Using the other 10 of the 20 recordings made during test development, you can make another set of recordings to help with administrator reliability. Administrators can use these to test themselves and ensure that they are maintaining the standards required for scoring tests. Unlike step 2 above, the administrators should not pause the recording at any time or refer to the master scoresheet until they have scored the entire test. At that point, they can compare their score with the trainer's to see how close they are. If an administrator scores 5 or more points above or below the trainer's total, they will need further training until their assessments are more accurate. You should also use these recordings to periodically re-calibrate administrators' skills if they are to administer the SRT over a longer period of time.
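The arithmetic of this reliability check is simple, but a small sketch may make it concrete. The following is a minimal illustration in Python; only the 5-point tolerance comes from the rule above, while the function name and the example totals are hypothetical:

```python
# Minimal sketch of the administrator reliability check described above.
# Only the 5-point tolerance comes from the rule in the text; the names
# and example totals are illustrative.

def needs_retraining(trainer_total: int, admin_total: int, tolerance: int = 5) -> bool:
    """True if the administrator's total is 5 or more points above or
    below the trainer's total for the same reliability recording."""
    return abs(admin_total - trainer_total) >= tolerance

# Example: the trainer scored a reliability recording at 38 points.
for name, total in [("Administrator A", 40), ("Administrator B", 32)]:
    status = "needs further training" if needs_retraining(38, total) else "within tolerance"
    print(f"{name}: {status}")
```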
How do you select participants?
Tests of inherent intelligibility assume that, among any group of L1 speakers, the language they speak will be relatively uniform, so testing only a small number of them will give the same results as testing all of them. Thus, when testing whether people can understand another language because of linguistic relatedness between their L1 and an L2, testing only a few speakers of the L1 will give you the information you need.
But SRTs are not tests of inherent intelligibility. They are used to test language proficiency... and proficiency varies not only from person to person but also from day to day for each language user.
This means we need to sample well and also bear in mind that our results may be influenced by variables as idiosyncratic as the time of day or whether the participant's baby kept them up for hours the night before!
We should also ensure that participants have no impediment to speaking clearly, i.e. they have all their teeth, aren't chewing anything, and so on.
Screening Questionnaire
Once participants have been selected through a sampling method, we need to administer a questionnaire to gather basic demographic data. This will help confirm that they are suitable for our research. It can also be helpful to include questions about variables that might influence language learning. The following is a list of some of the things such a questionnaire might include:
- name
- age
- level of education
- place of residence
- profession
- language spoken at home
- clan
- places travelled to
- frequency of travel
- purpose of travel
- language/s spoken while travelling
- relatives who speak the test language
- patterns of exposure to the test language
- patterns of use of the test language
- preferences for language use
- language attitudes
- etc!
This list does not cover everything. But not everything would need to be covered in a screening questionnaire. The goals of the survey will have determined the need for and the purpose of the test. Use these factors to guide you as you construct the screening questionnaire and select the items that are most relevant for your test.
Once a participant has taken the test and you have a score for them, you can analyse their results and compare them to the information they provided in this questionnaire. Good sampling means that any variables which are going to affect your data should be revealed through the questionnaire.
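If you store screening data electronically, keeping each participant's questionnaire answers and eventual test score in one structured record makes this comparison straightforward. Here is a minimal sketch; all field names are hypothetical, drawn from the list above:

```python
# Minimal sketch of a screening record; field names are illustrative,
# based on the questionnaire items listed above.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ScreeningRecord:
    participant_id: str
    name: str
    age: int
    education_level: str
    residence: str
    home_language: str
    languages_while_travelling: list = field(default_factory=list)
    relatives_speak_test_language: bool = False
    srt_score: Optional[int] = None  # filled in once the test is scored

# Example: record a participant, then add their score after testing.
p = ScreeningRecord("P001", "Example Name", 27, "secondary", "Village A", "Language X")
p.srt_score = 32
```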
Testing
Having developed your SRT and trained your administrators, you are now ready to carry out the testing itself. Although developing the test takes time (as developing any worthwhile survey tool should), actually testing an individual with an SRT is very quick. However, you still need to ensure you have the right environment for your test and that the administrators know how to create this environment.
You will need a device to play back your test recording, plus two sets of headphones connected by a Y-connector so that both the participant and the administrator can hear the recording at the same time. You might also consider having one person operate the recording and a third person do the scoring. If you do this, you will need a third set of headphones and another Y-connector so that all three can hear the recording simultaneously.
In addition to a playback device, you might also consider using a second device to record the first 5-10 participants in each location. This serves as a reference which can be very helpful if there are slight dialectal differences which give consistent "errors" as people produce varieties of the test language that are more natural to them. It is important that you do not mark consistent dialectal differences, even lexical ones, as errors. If you do record the first group of participants' responses, don't forget to obtain informed consent.
Make sure you choose a location where there are as few distractions as possible for the participant. Onlookers should be told to remain as quiet as possible and not to communicate with the participant. It is important to realise that if a participant waiting for their turn overhears the repetitions of a current participant, it does not affect their future participation: they cannot hear the original recording, and the amount of language involved and the time between overhearing it and taking their own turn are too great for them to memorise what they hear. In any case, onlookers have no way of knowing how accurate the responses they hear are, which is why it is vital that the test administrator does not give verbal feedback on the participant's performance. Instead, give general encouragement and praise for participating in the test and "knowing the language so well".
Scoring
You will need to prepare a very large number of scoresheets so that you have a fresh one for each participant at each location you visit. If you are on a long survey and carrying such a large amount of paper around is prohibitive, consider using a netbook or some other device to store your data. If you do so, carry an external drive of some sort to back up your data, and keep it in a waterproof case, separately from the device, in case you lose one or the other.
Scoresheets should have the test sentences written in the local script or IPA, depending on the administrator's preference. There should be enough space around each sentence to mark errors. Each scoresheet will also need space to record the participant's name, age, sex, mother tongue, participant ID number, etc.
The administrator should sit so that the participant cannot see what is written on the scoresheet; bear in mind that if future participants are watching, they should not be able to see the scoresheets either, particularly if the sentences are in the local script. If the administrator only makes marks when errors occur, the participant may realise this and react to their errors being recorded. It is therefore important to make (or be seen to make) the same number of marks for good sentences as for erroneous ones.
Each word that the participant does not repeat perfectly should be marked. Three or more errors in a sentence means the participant scores nothing for that sentence. Errors consist of the following:
- omission - leaving out words that are in the recording
- substitution - replacing a word from the recording with another word that does not occur at that point in the recorded sentence
- change of word order - swapping the position of words from one part of the sentence to another
- pronunciation - not variations that merely reflect a different accent, but only changes that result in loss of intelligibility and so prevent communication
- repetition - this includes restarting a sentence or going back to repeat part of a sentence once started. If a restart is made to correct an error, one point is deducted for the original error but not for the restart.
- addition - putting anything into the sentence that is not in the original
Scoring is the same for all of these, i.e. 1 point is deducted for each error, no matter what type it is.
Note that in part 5 of our guide to developing an SRT we provided a key for scoring that includes suggested symbols that administrators can use to mark types of error. However, in a field testing situation, this level of detail is not only unnecessary, it also slows down the testing procedure too much. Simply marking the place where an error occurred is enough for an actual testing situation.
Once an error is made in a word, one point is deducted for that word. Should a participant make two or more errors on the same word, e.g. mispronouncing it unintelligibly and also repeating it, administrators should simply mark this as one error. Further errors must occur on other words to be counted.
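To make the arithmetic concrete, here is a minimal sketch of the per-sentence scoring rule. It assumes, as the rules above imply, that each sentence is worth 3 points: 1 point is deducted per errored word regardless of error type, several errors on one word count once, and three or more errors mean a score of zero. The function name is hypothetical:

```python
# Minimal sketch of per-sentence SRT scoring as described above.
# Assumes each sentence is worth 3 points, so three or more errored
# words automatically give a score of zero for that sentence.

def score_sentence(errored_words: set, max_points: int = 3) -> int:
    """errored_words holds one entry per word marked as an error;
    using a set enforces the one-error-per-word rule."""
    return max(0, max_points - len(errored_words))
```

A participant's total is then the sum of their 15 sentence scores, giving a maximum of 45 points under these assumptions.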
If, during testing, there is debate about what constitutes an error or how many errors may have occurred, the training of administrators has not been thorough enough. Prior to a survey, the entire team should be familiar enough with the sentences, and with the potential errors flagged from the pilot test recordings, to know how to score the various errors that are likely to occur. Consider this test sentence:
The teacher repeated the sentence to the students.
and the response:
The teacher is saying sentence to the students.
How many points would you deduct?
There are three errors and three points should be deducted thus:
- omission: the is missing
- substitution: saying replaces repeated
- addition: is is inserted
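Using the hypothetical score_sentence sketch from the previous section, this worked example would come out like so:

```python
# The three distinct errored words in the example response:
# "the" omitted, "repeated" replaced by "saying", "is" added.
errors = {"the (omitted)", "repeated -> saying", "is (added)"}
print(score_sentence(errors))  # prints 0: max(0, 3 - 3) = 0 points
```

Three errors means three points are deducted, and since the sentence is worth 3 points, the participant scores nothing for it.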
Good training will include plenty of discussion of situations like this example.