When you measure reliability, you are measuring how well your findings can be repeated. In other words, if you have a reliable questionnaire, you would get the same data if it were possible to administer it again in the same way to the same people. The less reliable a tool is, the less likely it is to give you the data you are looking for, and the more likely your data will include other information that muddies the water. Put another way, a more reliable tool is less noisy.
It's not only tools that need to be reliable; our whole approach needs to be too. In the real world of social science, we have to deal with noise all the time. But if this noise is truly random (e.g. the way someone feels on a particular day) and we have a good sample of subjects, it will tend to cancel itself out. For example, although some people will feel terrible when they sit down to do a wordlist with you, others will feel better than normal. What we don't want to do is introduce systematic noise through bad sampling, for example by choosing to take wordlists from high school students during exam week!
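To make the difference between random noise and sampling bias concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the "true" wordlist score of 70, the size of the day-to-day noise, the 8-point exam-week penalty, and the helper name `simulate_scores` are all hypothetical and do not come from real survey data.

```python
import random
import statistics


def simulate_scores(true_score, n_subjects, noise_sd, bias=0.0, seed=0):
    """Return observed scores: true score + systematic bias + random daily noise.

    All parameters are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    return [true_score + bias + rng.gauss(0, noise_sd) for _ in range(n_subjects)]


TRUE_SCORE = 70.0  # assumed "real" wordlist score for the group

# Random noise only: some subjects feel worse than usual, some feel better.
random_only = simulate_scores(TRUE_SCORE, n_subjects=2000, noise_sd=10.0)

# Bad sampling: every subject is tested during exam week, so every score
# is pushed down by the same assumed amount on top of the random noise.
exam_week = simulate_scores(TRUE_SCORE, n_subjects=2000, noise_sd=10.0, bias=-8.0)

random_error = abs(statistics.mean(random_only) - TRUE_SCORE)
biased_error = abs(statistics.mean(exam_week) - TRUE_SCORE)

print(f"random noise only: sample mean is off by {random_error:.2f}")
print(f"exam-week sample:  sample mean is off by {biased_error:.2f}")
```

Under these assumptions, the average of the randomly noisy scores lands close to the true score, while the exam-week average stays well below it no matter how many students are added. A larger sample shrinks random noise, but it does nothing to remove a bias built into the sampling.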
The more reliable your testing is, the more confidently you can generalise your findings beyond the few people you gathered data from, because consistent data are a precondition for valid conclusions. As the excellent guide to statistical reliability on experiment-resources.com says, "Simply put, reliability is a measure of consistency."
This is crucial for us as surveyors. If language surveying is truly to be language assessment, we must make sure that the small sample we survey provides data which we can use to make assessments of the entire language or dialect group. If not, our findings cannot be used to inform language development programmes for entire communities of people.