Talk:Interpreting Word List Data

What is the standard cutoff percentage? -- Marcus Hansley 01:46, 21 October 2011 (PDT)

John,

I am writing my first survey report and am getting into the interpretation of the word lists that we collected. As I started looking into how to interpret the calculated lexical similarity percentages after I analyzed them in WORDSURV, I found that Joseph Grimes and Gary Simons were recommending a 60% cutoff. However, in several survey reports I have reviewed, and on the SurveyWiki site which you updated, under ‘Interpreting Word List Data’, 70% is the recommended cutoff. Hmm.
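
As best I can tell, the percentage itself comes from the analyst grouping each variety’s form for a gloss into similarity sets and then counting how often two varieties land in the same set. Here is a minimal sketch of that idea; the data and the grouping are hypothetical, and this is not WORDSURV’s actual implementation:

```python
# Simplified sketch: two varieties "agree" on a gloss when the
# analyst has placed their forms in the same similarity group.
# Illustrative only; not how WORDSURV itself computes the figure.

def lexical_similarity(groups_a, groups_b):
    """Percentage of comparable glosses on which two varieties agree.

    groups_a, groups_b: dicts mapping gloss -> similarity-group label,
    with None marking a missing or non-comparable form.
    """
    comparable = [g for g in groups_a
                  if groups_a[g] is not None
                  and groups_b.get(g) is not None]
    if not comparable:
        raise ValueError("no comparable glosses")
    shared = sum(1 for g in comparable if groups_a[g] == groups_b[g])
    return 100.0 * shared / len(comparable)

# Hypothetical group assignments for four glosses in two varieties.
variety_1 = {"water": "A", "fire": "A", "dog": "B", "eye": "A"}
variety_2 = {"water": "A", "fire": "B", "dog": "B", "eye": None}

print(f"{lexical_similarity(variety_1, variety_2):.0f}%")  # 67%
```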

I checked around and was given a paper by Douglas Boone, written in 2007 for a presentation at AFLAC, which gave me a bit more background on what this cutoff percentage is based on and how reliable it is. This is the relevant portion of his paper:

‘APPENDIX A. ON THRESHOLDS AND CRITERIA

As far as I am aware, the published basis for the practice of using a threshold figure for inferring either possible comprehension or probable non-comprehension is the work of Gary Simons and of Joseph and Barbara Grimes. Simons (1979/1983) suggested that comprehension could be predicted fairly well based on vocabulary similarity, while J. Grimes (1988/1992) maintained that the correlation is usually too weak to be of value. However, they were in agreement that intelligibility is not expected in cases of less than 60% shared vocabulary.

Some survey teams use 70% as their threshold. This is in keeping with the “Simplified Flow Chart for Decision Making” that can be found at the beginning of the Survey Reference Manual (“General Considerations” section) and with the “Language Assessment Criteria” that came out of the first International Language Assessment Conference in 1989.

Simons observes that for his combined data, “above 60 percent similarity, intelligibility steadily rises”. In his model, expected comprehension at 60% similarity is only about 20–30%. Expected comprehension at 70% similarity is about 45–50%, still too low to suggest the possibility of shared literature. In his review, Grimes considers not expected values, but the highest scores obtained in actual testing; he observes: “Vocabulary similarity percentages of 60 percent and below go consistently with intelligibility measured at 67 percent and below on simple narrative material.” In some of the cases that they studied, the “vocabulary similarity” percentage represents cognates; in others the basis for evaluating similarity is doubtless more impressionistic.’
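
If I read that right, the cutoff works as a screening device: below it you can rule out intelligibility, while above it comprehension is only possible and still has to be tested directly. Something like this hypothetical rule, which is my own illustration rather than anything Boone proposes:

```python
# Hypothetical screening rule based on the thresholds above:
# similarity below the cutoff predicts probable non-comprehension;
# similarity above it only licenses intelligibility testing, since
# even 70% similarity predicts roughly 45-50% comprehension at best.

def screen_pair(similarity_pct, cutoff=60):
    """Classify a variety pair on lexical similarity alone."""
    if similarity_pct < cutoff:
        return "intelligibility not expected"
    return "comprehension possible; run intelligibility testing"

for pct in (45, 62, 78):
    print(f"{pct}% -> {screen_pair(pct)}")
```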


So it looks like 60% is a safe cutoff to use, though 70% is often applied as a more conservative rule of thumb. What is your take on it?

Thanks,

Marcus Hansley

Re: What is the standard cutoff percentage? -- John Grummitt 21:57, 23 October 2011 (PDT)

Hi Marcus,

Great to get your email. I very much appreciate you writing. First, a little background on me. I’ve been in Papua New Guinea less than a year and have only finished one survey myself. On that survey, a team member did all the lexicostatistical analysis and, although I followed along, I was not involved in enough detail to consider myself anything like an authority on word lists. So, just to let you know: I have compiled much of what is on SurveyWiki from other sources, and if there’s any ambiguity there, that’s simply a reflection of my inexperience.

That said, I will have copied the information about analysing word lists from recognised authorities. One of these was Ramzi Nahhas & Noel Mann, who wrote a paper called Steps of Word Lists while working at Payap University in Thailand. This document, written in 2006, serves as the source for much of the methodology that is on SurveyWiki, and at the bottom of the page Preparing a Word List I acknowledge this source in a footnote.

I’d like to make two comments and then a suggestion. My first comment is about the diverse contexts that surveyors work in worldwide. Here in PNG, we have 837 living languages to sort out in a very small geographic area with very small populations. Most of the remaining languages we have to survey have fewer than 1,000 speakers. When we look at one particular variety and consider its context, we are dealing with a vastly different situation from that of, say, Southern Sudan, which has a much larger population spread over far greater distances. Because of these differences in context, when we talk at a global level as we are doing, reducing our methodology to arguing about accuracy within 10% without considering context is, in my opinion, meaningless. This is why Nahhas & Mann say that some teams use a higher percentage of 70%. What they’re in fact saying is that some teams recognise that, for their context, 60% has been shown to be inaccurate and a higher cutoff is needed. I would guess that they are referring to teams they have worked on in Southeast Asia, which has a similar linguistic makeup to PNG. The 60% Grimes et al. refer to is undoubtedly based on Casad’s seminal work in 1970s Mexico, which could not be more different sociolinguistically from where those of us who use 70% work.

The second point is different but related. When we draw our conclusions in a survey report, it would be a rash surveyor who based them heavily on lexicostatistical data alone. To say that two language communities can understand each other because we have a number between 60 and 70 is, again, meaningless. None of the tools that I’m aware of in the survey field is rigorous enough to be used in isolation to give us conclusive findings for language project recommendations. Triangulation is an absolute necessity. I would expect that lexicostatistical findings on any survey would have to be replicated through the use of at least one other method whose results demonstrated the same. Were this not the case, I would say so in my report and consider that neither method has given me anything conclusive. In PNG, we often find that it is our sociolinguistic data, rather than our linguistic data alone, which give us the richer picture. Now that may not be the case in your scenario. But here we have very complex social interrelations between language groups who often live ten minutes’ walk away from each other and claim to speak different languages. If we do the lexicostatistics, we may find ourselves looking at 80% similarity. Yet they refuse to accept this socially. Make a recommendation in this case for a project based on the lexicostatistics alone and you will doom a language team to failure. Linguistic identity is the trump card here, and it rarely shows itself through lexicostatistical analysis.

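To put the triangulation point another way: I would only report a verdict that at least two independent methods support. A rough sketch of that principle, with made-up method names, verdicts, and data:

```python
from collections import Counter

# Sketch of triangulation: report a verdict only when at least two
# independent methods agree; otherwise treat the survey's findings
# as inconclusive. All names and data here are hypothetical.

def triangulate(findings):
    """findings: dict mapping method name -> verdict string."""
    verdict, support = Counter(findings.values()).most_common(1)[0]
    return verdict if support >= 2 else "inconclusive; gather more data"

survey = {
    "lexicostatistics": "same language",        # e.g. 80% similarity
    "intelligibility testing": "different",
    "sociolinguistic interviews": "different",  # shared identity rejected
}
print(triangulate(survey))  # -> different
```
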
Again, thanks for the opportunity to clarify this. I very much appreciate it.

John