Machine Power and Fuzzy Decision Making Applied to Surveys

Dr Juergen Ude

PROBLEMS WITH MAINSTREAM TECHNOLOGY ALONE

Considerable academic research has been performed on Survey Analysis, under the topics of Categorical Analysis and Contingency Tables. Much of the technology originated between 50 and 300 years ago and can be considered out of date. The extensive research to date has shown that there are many grey areas in the theory, which have resulted in what amount to band-aid solutions to cope with the various contingencies. Unfortunately, band-aids come off.

Although the research has provided invaluable insights, there are many issues, some of which have been pointed out by the researchers themselves. At the risk of generalizing too much, most of the papers are written for an academic audience. They tend to use acronyms, squiggly symbols, technical jargon and references that assume a level of knowledge not everyone has. At times the material seems overcomplicated in order to demonstrate academic know-how. This is understandable in a university or similar environment, especially when students write papers. Unfortunately, it makes it hard even for peers to truly understand the contents. One result is that many useful contributions lie idle in academic papers. Another is that some of the work is applied in practice without a full understanding of its limitations and suitability. Errors in the theory may also go undetected by peer reviewers because of the unnecessarily complex presentation.

Another issue with the research is that there appears to be too great an emphasis on precision. How much this matters depends on the application, but it is nevertheless a problem in the real world. Precision on its own is not a problem and is indeed desirable, and it is understood that the focus in a university or research environment must be on precision. How else will technology improve? However, the problem for surveys is that there are many uncontrollable factors that make precision impossible, and attempts to apply it are therefore pointless. Under one set of circumstances one approach is recommended, and under another set a different one. This becomes impractical to apply and, in the wrong hands, will result in misapplications and wrong decisions.

The question to be asked is why perfection is so necessary. If it is achievable, why not? But if it is not, why pursue it? Papers have been written on third-decimal-place errors. In a practical environment this is like using a ‘surgeon in a butcher’s shop’. If the reported p-value is 0.05, does it practically matter that the actual p-value is 0.02 or even 0.07? Do we know the magical number that is correct? Recall that the p-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true, and that the significance level is the probability of a false alarm we are prepared to accept. If a p-value is below, say, 0.05, we usually reject the null hypothesis (e.g. there is no linear trend) in favor of the alternative hypothesis (e.g. there is a linear trend), knowing that this rule wrongly rejects a true null hypothesis about 5% of the time.

Setting levels of significance and confidence intervals, e.g. 95% Confidence Intervals, has never been about obtaining exact 95% intervals. It has been about obtaining an indication of reliability. A 95% interval simply means that we can be reasonably certain that the interval contains the parameter of concern. A 99% interval means we can be highly certain. The values of 95 and 99 are not sacrosanct. They can never be precise, given factors such as sampling randomness and the impossibility of fitting a theoretical probability distribution exactly to the actual distribution. We are in a fuzzy environment, a reality where precision is futile and where better decisions are made by being realistic.
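The gap between a stated confidence level and the level actually achieved can be illustrated with a small calculation. The following is only a sketch under stated assumptions: it uses the textbook normal-approximation (Wald) interval for a proportion, and the sample size of 20 and true proportion of 10% are made-up values chosen for illustration. The function names are hypothetical and are not part of any product mentioned in this article.

```python
from math import comb, sqrt


def wald_interval(successes, n, z=1.96):
    """Textbook normal-approximation interval for a proportion (nominal 95%)."""
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def actual_coverage(n, true_p, z=1.96):
    """Exact probability that the interval contains true_p.

    Sums the binomial probability of every possible survey outcome
    whose interval happens to include the true proportion.
    """
    coverage = 0.0
    for k in range(n + 1):
        low, high = wald_interval(k, n, z)
        if low <= true_p <= high:
            coverage += comb(n, k) * true_p**k * (1 - true_p) ** (n - k)
    return coverage


# Illustrative values only: 20 respondents, true proportion 10%.
coverage = actual_coverage(n=20, true_p=0.10)
print(f"Nominal level: 95%, actual coverage: {coverage:.1%}")
```

With these illustrative inputs the actual coverage falls well short of the nominal 95%, at roughly 88%. Other sample sizes and proportions give other values, which is the point: the label on the interval is an indication, not a guarantee.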

Probability, the likelihood of obtaining a response, is a measure of the expected proportion of a response if a test were carried out an infinite number of times. More recently it has also been seen as the tendency to obtain an outcome in a one-off event. To put the difference in interpretation into perspective, consider the following example.

If we toss a balanced coin, in the long run 50% of the tosses will be heads. For a single toss, the 0.5 probability does not really help in decision making, although it does give a feel for the likelihood of the outcome. The reality is that the coin will land either heads or tails. The long-run occurrence of heads is not relevant in a one-off decision. For bookmakers and casinos, probabilities and odds are highly important to ensure that in the long run they do not lose. For the gambler, probabilities and odds only provide comfort when they are favorable. What matters to the gambler is whether the event occurs this time, not how often it occurs in the long term.

For example, if a coin is weighted so that the probability of obtaining a head is 99%, would one be confident enough to risk one’s life on a tail, just to win ten million dollars on a head? A reasonable person would not risk their life on probability if there is even the tiniest risk. For one-off decisions it is not about probability but about possibility. The same applies to surveys. Surveys are not about long-term correctness. Decisions are made on a survey-by-survey basis. A politician is not interested in the survey being 95% correct in the long run; he or she is interested in being correct now. Politicians are interested in the possibility of losing an election, not the probability. Probability only makes them feel good if it is in their favor.

Incorrect models are also an issue. For example, Logistic Regression models, which model the data according to an s-curve (sigmoid function), are highly popular. Yet the model is not very flexible: many curves do not follow the sigmoid function. The reliability of these models suffers when sample sizes are unbalanced, which is often the case with surveys. They are also not robust to outliers caused by bad and ‘fluke’ sampling.

The concept of significance is also a contentious issue. Statistical significance only means that there is some evidence that the association is not just due to the way the ‘numbers fell’ from sampling. It does not mean the result is practically significant. Conversely, a lack of statistical significance does not mean the result is not practically significant. Lack of evidence only means no evidence, not the non-existence of an effect.
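The difference between statistical and practical significance can be shown with a small sketch. It assumes a standard two-proportion z-test with a pooled standard error, which is a swapped-in textbook method and not necessarily what any particular package uses. The survey counts are invented for illustration, and the function name is hypothetical.

```python
from math import erfc, sqrt


def two_proportion_p_value(yes_a, n_a, yes_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_a, p_b = yes_a / n_a, yes_b / n_b
    pooled = (yes_a + yes_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Made-up case 1: huge samples, a practically negligible gap (50.0% vs 50.3%).
p_large = two_proportion_p_value(500_000, 1_000_000, 503_000, 1_000_000)

# Made-up case 2: small samples, a large gap (40% vs 56%).
p_small = two_proportion_p_value(20, 50, 28, 50)

print(f"0.3 point gap, one million per group: p = {p_large:.5f}")
print(f"16 point gap, fifty per group:        p = {p_small:.3f}")
```

Under these invented counts the negligible 0.3 point gap comes out statistically significant at the 0.05 level, while the 16 point gap does not. Neither p-value says anything about whether the gap matters in practice.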

A finding of statistical significance may itself be a wrong conclusion. It has many sources of error, especially with surveys, where sampling is not purely random.

Several statistical significance tests have been proposed by researchers, each based on an underlying assumed model. Yet these models are rarely correct, which invalidates the significance test results. A positive test result does not mean the best model has been used. If the number of categories is small, as is often the case in surveys, a model can appear statistically significant, especially if sample sizes are unbalanced, which is also a common occurrence in surveys.

Sensitivity analysis is ignored in mainstream technology applications. What effect does a rogue result have?
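One simple way to ask what a rogue result does is to change a single response and repeat the test. The sketch below assumes a Pearson chi-square test on a 2x2 table without continuity correction. The counts are invented, and the function name is hypothetical. It illustrates the idea only and is not a description of how any particular product performs sensitivity analysis.

```python
from math import erfc, sqrt


def chi_square_2x2_p_value(table):
    """P-value of a Pearson chi-square test on a 2x2 table (1 degree of freedom).

    Assumes no continuity correction and no empty row or column.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, chi2 is a squared standard normal,
    # so P(chi2 > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


# Made-up survey: rows are two groups, columns are "yes" / "no" answers.
original = [[16, 4], [9, 11]]

# The same survey with one respondent in the first group answering "no" instead.
one_flipped = [[15, 5], [9, 11]]

p_original = chi_square_2x2_p_value(original)
p_flipped = chi_square_2x2_p_value(one_flipped)

print(f"Original table:     p = {p_original:.3f}")
print(f"One answer changed: p = {p_flipped:.3f}")
```

With these invented counts, a single changed answer out of 40 moves the p-value from below 0.05 to just above it, so the "significant" conclusion rests on one respondent.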

MACHINE-POWERED FUZZY DECISION MAKING

Machine-powered fuzzy decision making is not about ignoring precision, significance testing, and modeling. If precise methods can be provided that are not confusing, do not depend on the situation, and carry no risk of misapplication, then it is foolish not to use them. If significance testing can be brought into the realms of practical reality, then it would be foolish not to use it. If models can be provided which truly reflect the actual situation, then it would be foolish not to use them. But that is often not the case with surveys.

Unfortunately, analysis has reached a point where many of its methods have acquired mythical status, are overcomplicated, and provide a false level of confidence, with wrong conclusions drawn as a consequence. The reason is very simple. The real world is fuzzy, not precise. There is no point in using a ‘surgeon in a butcher’s shop’.

There has been considerable resistance by academics to fuzzy technology because it is inconsistent with precise technology. However, with the growing use of computer power, application of this technology has grown since the 1970s. Unfortunately, fuzzy decision theory has also started to focus on using ‘precise’ theory, which is not realistic. It is not possible to obtain precise solutions in a fuzzy real-world environment.

BIS.Net Analyst uses a new approach. It is realistic in that it accepts that the real world is fuzzy. But it makes no attempt to use precise fuzzy technology, because that would repeat the mistake current analysis technology makes: it is unrealistic. Instead, BISNET Analyst uses robust mainstream technology as a starting point. It does not ignore it, and it recognizes the many contributions made by academia. Robust is not perfect, but it is applicable in a fuzzy world. Instead of attempting to provide a 95% confidence interval, it accepts that for surveys it is virtually impossible to guarantee that the actual confidence level equals the specified level. The same applies to p-values. The fuzzy concept is not about a 95% confidence interval but about an ABOUT 95% confidence interval. The objective of setting levels of significance is only to separate a reasonable level of confidence from a high level of confidence. The actual numbers are not relevant.

Surveys are more about possibility than probability, because probability is a concept relative to repeated decision making, as explained above. BISNET Analyst enhances the decision process by using machine power to provide additional information, enabling the analyst to learn more about the possibility than mainstream technology alone allows. Please refer to the Knowledge-Center for information on how this is done, with examples from the various survey analysis technologies that demonstrate the wrong conclusions that can be drawn with mainstream technology alone.

To conclude, machine-powered fuzzy decision making does not ignore mainstream technology. It uses it. It is about robust technology in a fuzzy environment. It is about using machine power to provide information on possibility as opposed to probability. However, it still provides mainstream information on probability for those who need it. Machine-powered fuzzy decision making is about enhancing mainstream technology by being realistic in a fuzzy world.
