A paradoxical effect of thorough examination of planning pros and cons

In the Fog Island Tavern:

– Bog-Hubert, I hear you had a big argument in here with Professor Balthus last night? Sounds like I missed a lot of fun?
– Well, Sophie, I’m not sure it was all fun; at least the good prof seemed quite put out about it.
– Oh? Did you actually admit you haven’t read his latest fat book yet?
– No. Well, uh, I haven’t read the book yet. And he knows it. But what got him all upset was actually one of Abbé Boulah’s pet peeves, or should I say his buddy’s curious findings.
– Come on, do tell. What about those could upset the professor? I thought he was generally in favor of the weird theories of Abbé Boulah’s buddy?
– Yes — but it seems he had gotten some hopes up about some of their possibilities — mistakenly, as I foolishly started to point out to him. He thought that the recommendations about planning discourse and argument evaluation they keep talking about might help collective decision-making achieve more confidence and certainty about the issues they have to resolve, the plans they have to adopt or reject.
– Well, isn’t that what they are trying to do?
– Sure — at least that was what the research started out to do, from what I know. But they ran into a kind of paradoxical effect: It looks like the more carefully you try to evaluate the pros and cons about a proposed plan, the less sure you end up being about the decision you have to make. Not at all the more certain.
– Huh. That doesn’t sound right. And the professor didn’t straighten you out on that?
– I don’t think so. Funny thing: I started out agreeing that he must be right: Don’t we all expect decision-makers to carefully examine all those pros and cons, how people feel about a proposed plan, until they become confident enough — and can explain that to everybody else — that the decision is the right one? But when I began to explain Abbé Boulah’s concern — as he had mentioned it to me some time ago — I became more convinced that there’s something wrong with that happy expectation. And that is what Abbé Boulah’s research seems to have found out.
– You are speaking strangely here: on examination, you became more convinced that the more we examine the pros and cons, the less convinced we will get? Can you have it both ways?
– Yeah, it’s strange. Somebody should do some research on that — but then again, if it’s right, will the research come up with anything to convince us?
– I wish you’d explain that to me. I’ll buy you a glass of Zinfandel…
– Okay, maybe I need to rethink the whole thing again myself. Well, let me try: Somebody has proposed a plan of action, call it A, to remedy some problem or improve some condition. Or just to do something. Make a difference. So now you try to decide whether you’d support that plan, or if you were king, whether you’d go ahead with it. What do you do?
– Well, as you said: get everybody to tell you what they see as the advantages and disadvantages of the plan. The pros and cons.
– Right. Good start. And now you have to examine and ‘weigh’ them, carefully, like your glorious leaders always promise. You know how to do that? Other than to toss a coin?
– Hmm. I never heard anybody explain how that’s done. Have to think about it.
– Well, that’s what Abbé Boulah’s buddy had looked at, and he developed a story about how it could be done more thoroughly. He looked at the kinds of arguments people make, and found the general pattern of what he calls the ‘standard planning argument’.
– I’ve read some logic books back in school, never heard about that one.
– That’s because logic never did look at or identify those, let alone study them. Not sure why, in all the years since ol’ Aristotle…
– What do they look like?
– You’ve used them all your life, just like you’ve spoken prose all your life and didn’t know it. The basic pattern is something like this: Say you want to argue for a proposed plan A: You start with the ‘conclusion’ or proposal:
“Yes, let’s implement plan A
because
1. Plan A will result in outcome B — given some conditions C;
and we assume that
2. Conditions C will be present;
and
3. We ought to aim for outcome B.”
– It sounds a little more elaborate than…
– Than what you probably are used to? Yes, because you usually don’t bother to state the premises you think people already accept so you ‘take them for granted’.
– Okay, I understand and take it for granted. And that argument is a ‘pro’ one; I assume that a ‘con’ argument is basically using the same pattern but with the conclusion and some premises negated. So?
– What you want to find out is whether the decision ‘Do A’ is plausible. Or better: whether or to what extent it is more plausible than not to do A. And you are looking at the arguments pro and con because you think that they will tell you which one is ‘more plausible’ than the other.
– Didn’t you guys talk about a slightly different recipe a while back — something about an adapted Poppa’s rule about refutation?
– Amazing: you remember that one? Well, almost: it was about adapting Sir Karl Raimund Popper’s philosophy of science principle to planning: that we are entitled to accept a scientific hypothesis as tentatively supported, or ‘corroborated’ as they say in the science lab, to the extent we have done our very best to refute it (to show that it is NOT true) and it has resisted all those attempts and tests. That is because no ‘supporting evidence’ can ever conclusively ‘prove’ the hypothesis, but one true observation of the contrary can conclusively disprove it. It’s the hypothesis that all swans are white: never proved by any number of white swans you see, but conclusively shot down by just one black swan.
– So how does it get adapted to planning? And why does it have to be adapted, not just adopted?
– Good question. In planning, your proposed plan ‘hypothesis’ isn’t true or false — just more or less plausible. So refutation doesn’t apply. But the attitude is basically the same. So Abbé Boulah’s buddy’s adapted rule says: “We can accept a plan proposal as tentatively supported only to the extent we have not only examined all the arguments in its favor, but more importantly, all the arguments against it — and all those ‘con’ arguments have been shown to be less plausible or outweighed by the ‘pro’ arguments.”
– Never heard that one before either, but it sounds right. But you keep saying ‘plausible’? Aren’t we looking for ‘truth’? For ‘correct’ or ‘false’?
– That’s what Abbé Boulah and his buddy are railing against: planning decisions just are not ‘correct’ or ‘false’, not ‘true’ or ‘false’. We are arguing about plans precisely because they aren’t ‘true’ or ‘false’ yet. Nor ‘correct’ or ‘false’, like a math problem. Planning problems are ‘wicked problems’; the decisions are not right or wrong, they are ‘good or bad’. Or, to use a term that applies to all the premises: more or less plausible. That can be interpreted as true or false only for the rare ‘factual’ claims or premises; more likely as ‘probable’ for the factual-instrumental premise 1 and the factual claim in premise 2; but as just plausible, or good or bad, for the ought claim in premise 3 and for the ‘conclusion’.
– Okay, I go along with that. For now. It sounds… plausible?
– Ahh. Getting there, Sophie; good. It’s also a matter of degrees, like probability. If you want to express how ‘sure’ you are about the decision or about one of the premises, just the terms ‘plausible’ and ‘implausible’ are not expressing that degree at all. You need a scale with more judgments. One that goes from ‘totally plausible’ on one side to ‘totally implausible’ on the other, with some ‘more or less’ scores in-between. One with a midpoint of ‘don’t know, can’t decide’. For example, a scale from +1 to -1 with midpoint zero.
– Hmm, it’s a lot to swallow, all at once. But go on. I guess the next task is to make some of your ‘plausibility’ judgments about each of the premises, to see how the plausibility of the whole argument depends on those?
– Couldn’t have said it better myself. Now consider: if the argument as a whole is to be ‘totally plausible’ — with a plausibility value of +1 — wouldn’t that require that all the premise plausibility values also were +1?
– Okay…
– Well, and if one of those plausibility values turns out to be less than ‘totally plausible’, let’s say with a pl value of 0.9, wouldn’t that reduce the overall argument plausibility?
– Stands to reason. And I guess you’ll say that if one of them had a negative value, the overall argument plausibility value would turn negative as well?
– Very good! If someone assigns a -0.8 plausibility value to premise 1 or 3, for example, in the above argument that is intended as a ‘pro’ argument, that argument would turn into a ‘con’ argument for that person. So to express that as a mathematical function, you might say that the argument plausibility is equal to either the lowest of the premise plausibility values, or the product of all those values. (Let’s deal with the issue of what to do with cases of several negative plausibilities later on, to keep things simple. Also, some people might have questions about the overall ‘validity’ or plausibility of the entire argument pattern, and how it ‘fits’ the case at hand; so we might have to assign a pl-value to the whole pattern; but that doesn’t affect the issue of the paradox that much here.)
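The two candidate rules Bog-Hubert mentions can be written down in a few lines. The Python sketch below is only an illustration: the function names `argument_pl_min` and `argument_pl_product` are made up here, and premise plausibilities are assumed to be numbers on the -1 to +1 scale described above.

```python
from math import prod

def argument_pl_min(premise_pls):
    """Argument plausibility as the lowest premise plausibility (weakest link)."""
    return min(premise_pls)

def argument_pl_product(premise_pls):
    """Argument plausibility as the product of all premise plausibilities."""
    return prod(premise_pls)

# Premises 1, 2, 3 of a 'pro' argument, one of them slightly less than certain.
slightly_unsure = [1.0, 0.9, 1.0]
# The same argument as judged by someone who finds premise 1 implausible.
rejected_premise = [-0.8, 1.0, 1.0]
# Two negative premises: the plain product turns positive again, which is
# the case the dialogue sets aside for later.
two_negatives = [-0.8, -0.5, 1.0]

for pls in (slightly_unsure, rejected_premise, two_negatives):
    print(pls, argument_pl_min(pls), argument_pl_product(pls))
```

With two negative premises the product comes out positive while the minimum stays negative, so the two rules are not interchangeable in that case.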
– So, Bog-Hubert, let’s get back to where you left off. Now you have argument plausibility values; okay. Weren’t we talking about argument ‘weight’ somewhere? Weighing the arguments? Where does that come in?
– Good question! Okay, consider just two arguments, one ‘pro’ and one ‘con’. You may even assume that they both have good overall plausibilities, so that they come close to +1 (for the ‘pro’ argument) and -1 (for the ‘con’ argument). You might consider how important they are, by comparison, and thus how much of a ‘weight’ each should have towards the overall plan plausibility. It’s the ‘ought’ premise, the goal or concern about the consequence of implementing the plan, that carries the weight. You decide which one is more important than the other, and give it a higher weight number.
– Something like ‘is it more important to get the benefit, the advantage of the plan, than to avoid the possible disadvantage’?
– Right. And to express that difference in importance, you could use a scale from zero to +1, and a rule that all the weight numbers add up to +1. A weight of ‘+1’ would simply mean that one argument carries the whole decision judgment.
– That’s a whole separate operation, isn’t it? And wouldn’t each person doing this come up with different weights? And, come to think of it, different plausibility values?
– Yes: All those judgments are personal, subjective judgments. I know that many people will be quite disappointed by that — they want ‘objective’ measures of performance, about which there’s no quibbling. Sorry. But that’s a different issue, too — we’ll have to devote another evening and a good part of Vodçek’s Zinfandel supply for that one.
– Okay, so what you are saying is that, subjective or objective, we’re heading for the same paradox?
– Right again. First, let’s review the remaining steps in the assessments. We have the argument plausibility values (for each person separately) and the weight or relative importance for each of the ‘ought’ premises. Multiply the argument plausibility by the weight of the goal or concern in the ‘ought’ premise, and you have your argument weight. Adding them all up (remember that all the ‘con’ arguments will have negative plausibility values) will give you one measure of ‘plan plausibility’. You might then use that as a guide to making the decision. For example: to be adopted, a plan should have at least a positive pl-value, or at least a pl-value you’ve specified as a minimum threshold value for plan adoption.
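The steps just reviewed can be sketched for one person’s judgments as follows. This is an assumption-laden illustration in Python, not the method’s official form: the function names, the sample numbers, and the idea of entering raw importance scores that are then scaled to add up to 1 are all invented here.

```python
def normalize_weights(raw_weights):
    """Scale non-negative importance scores so that they add up to 1."""
    total = sum(raw_weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in raw_weights]

def plan_plausibility(argument_pls, raw_weights):
    """Sum of each argument plausibility times the weight of its 'ought' premise."""
    weights = normalize_weights(raw_weights)
    return sum(pl * w for pl, w in zip(argument_pls, weights))

def passes_threshold(plan_pl, threshold=0.0):
    """Adoption guide: plan plausibility must exceed the chosen threshold."""
    return plan_pl > threshold

# One person's judgments: two 'pro' arguments and one 'con' argument.
argument_pls = [0.9, 0.6, -0.8]   # 'con' arguments carry negative values
raw_weights = [3, 2, 5]           # relative importance, normalized inside

plan_pl = plan_plausibility(argument_pls, raw_weights)
print(round(plan_pl, 2), passes_threshold(plan_pl))
```

In this made-up case a single heavily weighted ‘con’ argument is enough to pull two fairly plausible ‘pro’ arguments just below zero.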
– And that’s better than voting?
– I think so — but again, that’s a different issue too, also worth serious discussion. Depending on the problem and the institutional circumstances, decisions may have to be made by traditional means such as voting, or left to a ‘leader’ person in authority to make decisions. A plan-pl value would then just be a guide to the decision.
– So what’s the problem, the paradox?
– The problem is this: It turns out that the more arguments you consider in such a process, the more you examine each of the premises of the arguments (by applying the same method to the premises) and the more honest you are about your confidence in the plausibility of all the premises — they’re all about the future, remember, none can be determined to be 100% certain — the closer the overall pl-result will approach the midpoint ‘don’t know’ value, close to zero.
– That’s what the experiments and simulations of such evaluations show?
– Yes. You could see that already with our example above of just two arguments, equally plausible but one pro and the other con. If they also have the same weight, the plan plausibility would be zero, point blank. Not at all what the dear professor wanted to get from such a thorough analysis; very disappointing.
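A small numerical illustration of both effects, written as a Python sketch. It assumes the product rule for combining premise plausibilities and a weighted sum for the plan; the value 0.9 and the premise counts are arbitrary choices made here, not results from the experiments mentioned above.

```python
from math import prod

def plan_plausibility(arguments):
    """arguments: list of (argument_pl, weight) pairs; weights assumed to sum to 1."""
    return sum(pl * w for pl, w in arguments)

# Two equally plausible, equally weighted arguments, one pro and one con.
balanced = plan_plausibility([(0.9, 0.5), (-0.9, 0.5)])

# Honest but slightly uncertain premises: each extra premise examined at 0.9
# shrinks a product-based argument plausibility further toward zero.
premise_counts = (1, 3, 6, 12)
shrinking = [prod([0.9] * n) for n in premise_counts]

print(balanced)
for n, pl in zip(premise_counts, shrinking):
    print(n, round(pl, 3))
```

The first case cancels out exactly; the second shows how digging deeper into supporting premises, none of them fully certain, erodes even a one-sided argument.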
– Ahh. I see. Is he one of those management consultants who advise companies how to deal with difficult problems, and get the commissions by having to promise that his approaches will produce decisively convincing results?
– Oh Sophie — Let’s not go there…
– So the professor, he’s in denial about that?
– At least in a funk…
– Does he have any ideas about what to do about this? Or how to avoid it?
– Well, we agreed that the only remedy we could think of so far is to tweak the plan until it has fewer features that people will feel as ‘con’ arguments: until the plan-pl is at least more visibly on the plus side of the scale.
– Makes you wonder whether people in the old days, when they relied on auspices and ‘divine judgments’ to tip the scales, had a wiser attitude about this.
– At least they were smart enough to give those tricks a sense of mystery and ritual — more impressive than just rolling dice — which some folks can see as a kind of prosaic, crude divine judgment?
– Hmm. If they made sure that all the concerns affected people have about a plan were taken care of, what would be wrong with that?
– Other than that you’d have to load the dice — and worry about being found out? What’s the matter, Vodçek?
– You guys — I’ll have to cut you off…

2 Responses to “A paradoxical effect of thorough examination of planning pros and cons”

1 abbeboulah July 8, 2017 at 12:55 pm

There was some discussion on Facebook. Some preliminary results and speculations: did we learn anything?
It has become clear (as it probably should have been at the outset) that the provocative term ‘paradox’ was not the most appropriate one. The effect simply has to do with some people’s expectation of gaining more confidence or certainty that their decision about a proposed plan is the appropriate one by thoroughly examining, even systematically evaluating, the merit of all arguments pro and con. The surprise or disappointment comes from finding that the resulting measures of certainty or plausibility show not more decisive certainty but often less than their initial confidence in support of or opposition to the plan.
The insight that the discrepancy and disappointment is a personal matter for each participant in planning discussions is not a reason to just dismiss it, since it could significantly affect participants’ confidence and willingness to engage. So the question of how to deal with the effect remains to be addressed. The question is one of personal management skill for facilitators of ‘live’ project teams of limited size; as well as an issue for the design of platforms for large public planning discourse with wide, asynchronous participation.
One procedural rule of thumb is obvious but cumbersome: proposed plans for which the overall plausibility results hover too close to the midpoint of the plausibility scale after discussion and evaluation should be ‘sent back to the drawing board’ for improvement. The analysis of the assessment results can be helpful in pinpointing the precise features of the plan that have elicited negative or too low positive plausibility scores.
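As a rough sketch of how such a rule of thumb might be applied to one participant’s results, consider the Python fragment below. Everything specific in it is an assumption made for illustration: the function name `review_plan`, the width of the ‘too close to the midpoint’ band (0.2), and the sample arguments and numbers.

```python
def review_plan(contributions, band=0.2):
    """contributions maps an argument label to its weighted plausibility
    (argument plausibility times weight) for one participant."""
    plan_pl = sum(contributions.values())
    send_back = abs(plan_pl) < band          # too close to 'don't know'
    # Arguments sorted from most negative to most positive contribution.
    ranked = sorted(contributions.items(), key=lambda item: item[1])
    return plan_pl, send_back, ranked

# Hypothetical judgments for an imagined street redesign plan.
contributions = {
    "less through traffic": 0.35,
    "higher construction cost": -0.30,
    "loss of parking": -0.10,
    "safer crossings": 0.12,
}
plan_pl, send_back, ranked = review_plan(contributions)
print(round(plan_pl, 2), send_back, ranked[0][0])
```

The ranked list puts the most negative contributions first, which is where a redesign effort would presumably start.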
The second aspect relates to the qualitative difference between initial ‘intuitive’ or ‘offhand’ judgments of plausibility, and the ‘deliberated’ judgments at interim review points or the end of the discourse and assessment process. An interesting comment suggested that some people seem to attach more ‘value’ to, and take more pride in, their intuitive judgment than the result of lengthy deliberation, while others rely on (and thus ‘value’) thorough examination of all arguments. The question is, should these valuations be included in the assessment of the plan, and if so, how? An obvious argument would be that thorough deliberation comes with a cost of time, engagement, thinking etc. that adds to the cost of the eventual decision, but not necessarily to the quality of the plan.
The qualitative difference aspect led to another consideration: that of adding a kind of measure of ‘thoroughness of deliberation’ to the final plan plausibility score. Since the plausibility score itself does not indicate any of this, would complementing it with such measures relieve the discomfort about the lack of increase of plausibility in the deliberated results? Would it generate a sense of confidence in the very effort: ‘we have tried our best to examine all the implications of the plan’? Or should the deliberated plausibility scores be given a different label to indicate their qualitative difference from offhand judgments shot from the hip?

2 abbeboulah July 8, 2017 at 8:39 pm

What would separate ‘thoroughness of deliberation’ measures consist of? Several considerations offer themselves. One is to measure the ‘breadth’ of the discourse and analysis; for example the number of aspects/arguments examined. This is almost implied by the principle that ‘all’ significant effects or consequences of plans should be given ‘due consideration’ before making a decision. This can only mean that all concerns that have been brought up in the discussion must be included in the formation of the measure of performance judgment of the plan. In the proposed approach, this would be the plan plausibility judgment, on the assumption that the plan has been made known to all potentially affected parties, adequately described, and that all those parties have been given adequate opportunity to voice their concern. Then, an indicator that consideration has indeed been given to a concern by a discourse participant would be the fact of that participant having entered a plausibility and relative weight judgment for the respective argument premises, which then are made a part of the calculation of the overall plan plausibility for that participant. So the number of arguments or argument premises given such judgments would be a first candidate for that ‘breadth’ thoroughness of deliberation measure.
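A first, deliberately crude version of such a ‘breadth’ count might look like the Python sketch below. The data layout (a dictionary of judgments per premise label), the function name, and the added ‘share of raised premises’ figure are assumptions introduced here for illustration only.

```python
def breadth_of_deliberation(judgments, raised_premises):
    """Count the raised premises for which a participant entered both a
    plausibility and a weight judgment; also return that count as a share."""
    judged = [
        p for p in raised_premises
        if p in judgments
        and judgments[p].get("pl") is not None
        and judgments[p].get("weight") is not None
    ]
    share = len(judged) / len(raised_premises) if raised_premises else 0.0
    return len(judged), share

# Hypothetical discussion with four premises raised so far.
raised = ["P1", "P2", "P3", "P4"]
judgments = {
    "P1": {"pl": 0.8, "weight": 0.5},
    "P2": {"pl": -0.4, "weight": 0.3},
    "P3": {"pl": 0.2, "weight": None},   # weight still missing
}
count, share = breadth_of_deliberation(judgments, raised)
print(count, share)
```

Such a count says nothing about how carefully each judgment was made; it only records that the concern was not skipped.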
Of course, this will not be considered sufficient at all by people who will point out that much or most of the information about how plans and actions affect outcomes, and under what conditions, may not be known by discourse participants but has been documented or can be found by ‘ad-hoc’ research: ‘googling’ or actual investigation of context conditions, even experiments and actual observation or ‘virtual’ perception — the analysis of systems models: calculation and simulation.
The number of actual participant contributions will be finite, so as to permit claims of ‘having duly considered ALL those concerns’, — ‘complete thoroughness’. But the search for relevant documented information, and even more so the extent of ad-hoc analysis, modeling, calculation and examination of possible consequences of plan variations is open-ended. As Rittel taught: one of the ‘wicked’ properties of planning problems is that there is no obvious, natural ‘stopping point’ for the work on such problems. One can always do better. The development of adequate standards for what would count as ‘sufficiently thorough’ due consideration of potentially available information has long been a part of the enterprise of science, to its credit. But it is much more elusive for planning and policy-making. Not only because the deontic premises of planning arguments — the goals, expectations, concerns, desires and fears of people — are not nearly as predictable and fixed as standard-makers would like them to be (for example, by efforts to codify the ‘needs’ of citizens that must be met by government plans), but also because any conditions generated by the plan may well be the target of future innovative efforts to change or improve upon them — with the innovative features not known, by definition. Assuming they can be anticipated would arrogantly and falsely claim that they are already invented and known.
The other aspect of thorough deliberation is of course the ‘depth’ of consideration: the extent to which the support for the premises of each argument has been investigated and evaluated: evidence, further arguments and premises of supporting issues, data, data sources and the methods of validation used to draw inferences from those, etc.
It may be important to note that here too, Popper’s admonition to focus on the counterarguments (not only supporting evidence) is the real basis for the degree of confidence we may have: the extent to which we have done our best to find flaws in our plans, that is, salient counterarguments, and to which these have been found to be implausible or less plausible than the supporting arguments.
I suggest that the proposed approach to the systematic evaluation of planning arguments can produce adequate material for the construction of both ‘breadth’ and ‘depth’ measures of deliberation — measures that can meet the expectation for adequate confidence in decisions, even if the actual plan plausibility itself resulting from the deliberative effort is closer to the midpoint of the plausibility scale than a resounding positive or negative score.
The issue of the role of intuition has been the subject of several comments. In my opinion, all ‘evaluation’ judgments are ultimately intuitive and subjective. We can explain, for some judgments, how our judgment depends on ‘objective’ measurements of the plan’s performance, or on other support in the form of more arguments of the same kind, which also ultimately rest on premises we accept as not needing further explanation: ‘intuitively obvious’. So trying to ‘eliminate intuition’ is not meaningful. Instead, we might focus on the difference between ‘offhand intuitive judgments’ that are not based on prior experience, knowledge, analysis etc., i.e. that rest on ignorance and inexperience, and judgments that are based on internalized experience, familiarity with the subject, analysis and evidence, but are not the result of the admittedly cumbersome evaluation process. We very much want our leaders (people put into leadership positions to make decisions for which lengthy public discussion is not possible) to be able to make fast intuitive judgments of that second kind. The approach I am suggesting can provide the material for constructing measures of that kind of ‘reliable’ intuition, even based on a person’s contributions to a thorough discussion. Whether, how, and by whom that should be done is another question for discussion, of course.

Abbe Boulah’s Weblog