Is the term ‘absolute truth’ meaningless?

Thorbjørn Mann, July 2020

Some thoughts about ‘absolute truths’, systems thinking and humanity’s challenges: an exploration of the knowledge needed for a discourse that I suggest is critically significant for systems thinking about what to do about humanity’s big challenges. I apologize for the roundabout but necessary explanation.

‘What better be done’: absolute truth? 

There are recurring posts in Systems Thinking groups that insist on decisions being made by focusing on the ‘right’ things, or on what better (best) be done, implying that what ‘better be done’ is a matter of ‘absolute, objective truth’. Thus, any suggestions about the issue at hand are derailed and dismissed by calling them mere subjective opinions and by repeating the stern admonition to follow the absolute truth of ‘doing what better be done’, as if all other suggestions were not already efforts to do so.

Questions about what those truths may be are sidestepped or answered by the claim that they are so absolute, objective and self-evidently true that they don’t need explanation or supporting evidence. Heretical questions about this are countered with a question such as “Are you questioning that there are absolute truths?” Apart from the issue of whether this may be a tactic by the proponent of an answer (the one declared to be an absolute truth) to get that answer accepted: is it an effort to sidestep the question of what should be done altogether, stalling it in the motherhood issue of absolute truth? At any rate, it raises questions.

Does this call for a closer examination of the notion of ‘absolute truths’, and of how one can get to know them? What is an ‘absolute truth’ (as compared to a not-so-absolute one)?

Needed distinctions

There may be some distinctions that need reminder (being old distinctions) and clarification,  beginning with the following:  

‘IS’- States of affairs in ‘reality’  versus statements about those 

There exist situations, states of affairs ‘s’, constituting what we call ‘reality’. Existing, they ‘are’, whether we know them or not (mostly, we don’t). And if we know and recognize such a state, we call it ‘true’. But isn’t that less a ‘property’ of a state ‘s’ than a label attached to the statement about ‘s’? About ‘s’ itself, is it not sufficient to simply say ‘it is’? So what do we mean by the expression ‘absolute truth’? As a statement about ‘s’, it would seem to imply that there are states of affairs that ‘are’ ‘absolutely true’ and others that aren’t. So would it not be necessary to offer an explanation of this difference? If there isn’t one, does the ‘absolute’ part become meaningless and unnecessary?

So the practical use of ‘true’ or ‘false’ really refers to statements, claims about reality, not reality itself. When we are describing a specific situation ‘s’  or even claiming that it exists, we are making a claim, a statement.  When such a statement matches the actual state of affairs with regard to s, we feel entitled to say that the statement is ‘true’. Again: ‘truth’ is not a property of states of affairs but a judgment statement about ‘content’ statements or claims. 

About the claim of a statement ‘matching’ the actual state of affairs: do we really know ‘reality’, and how would we know? Discussions and attempted demonstrations about this tend to use simple concepts, for example: “How many triangles are depicted in this diagram?”. The simple answers each seem ‘obviously true’ (even though people occasionally disagree even about those) but, upon examination, turn out to be based on differently understood definitions of the concepts involved. The definitions are not always stated explicitly, which is a problem: it leads to the troublesome situation where each of the disagreeing parties can honestly refer to answers based on ‘their’ definition as ‘true’ and to other answers as ‘false’ (and consequently question the sanity or good intentions of anybody claiming otherwise). So are all those answers ‘absolutely true’, but each only given the appropriate related definitions and understanding?

The understanding of ‘triangle’ in the diagram example may be that of “three points not on the same straight line in a plane, connected by visible straight lines.” There may be a fixed ‘true’ number of such triangles in the diagram. But if the definition of ‘triangle’ is just “three points not on the same straight line”, and it is left open whether the diagram itself intends to show a plane or a space, the answers become quite different and even uncountable (‘infinitely many’, given the infinitely many points on a plane or in a space depicted by the diagram that exist in triangular position relative to each other).

The term ‘depicted’ also requires explanation: does it only refer to triangles ‘identified’ by lines connecting three selected points, lines drawn in a color different from the color of the ‘plane’ (or space) of the diagram? If drawn in the same color, are they  n o t  ‘depicted’? Do the edges and corners of the diagram picture ‘count’ as ‘depicting’ the lines and apex of a triangle, or not? So even in this simple ‘noncontroversial’ example, there are many very plausible answers, and the decision to call one or some of them ‘absolute truth’ begins to look somewhat arbitrary.
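How much the ‘true’ count depends on the chosen definition can be shown with a small sketch. The diagram below (four named points and five drawn lines) is invented for illustration and is not the diagram from the discussions mentioned above; the function name `count_triangles` is likewise hypothetical. The sketch considers only the named points, so it leaves aside the ‘infinitely many’ reading and the questions about what counts as ‘depicted’.

```python
from itertools import combinations

def collinear(p, q, r):
    # Three points lie on one straight line when this cross product is zero.
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) == 0

def count_triangles(points, drawn_lines=None):
    """Count triples of named points that are not on one straight line.

    If drawn_lines is given, a triple counts only when all three of its
    sides are among the drawn lines (the 'connected by visible lines' reading).
    """
    lines = None if drawn_lines is None else {frozenset(l) for l in drawn_lines}
    count = 0
    for a, b, c in combinations(points, 3):
        if collinear(points[a], points[b], points[c]):
            continue
        sides = ((a, b), (b, c), (a, c))
        if lines is not None and not all(frozenset(s) in lines for s in sides):
            continue
        count += 1
    return count

# Hypothetical diagram: a rectangle with one diagonal and one side left undrawn.
points = {"A": (0, 0), "B": (4, 0), "C": (0, 3), "D": (4, 3)}
drawn = [("A", "B"), ("B", "C"), ("A", "C"), ("B", "D"), ("C", "D")]

print(count_triangles(points, drawn))  # definition 1: sides must be drawn
print(count_triangles(points))         # definition 2: any three non-collinear points
```

The same diagram yields two different answers, each ‘true’ under its own definition, which is the point of the example.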

Probability

The labels ‘true’ and ‘false’ apply to existing or past states of affairs. Do they also apply to claims about the future (that is, to forecasts, predictions)? The predicted states of affairs are, by definition, not ‘true’ yet. The best we can do is to say that such a statement is more or less ‘probable’: a matter of degree that we express by a number from 0 (virtually impossible) to 1 (virtually certain), or by a percentage between zero and 100.

Actually, we usually are not totally certain about the truth even of our claims about actual ‘current’ or ‘always’ states of affairs. We often make such claims only to find out later that we were wrong, or only approximately right, about a given situation. This holds even more for more complex claims, such as whether a causes b and whether it will do so in the future. But it is fair to say that when we make such claims, we aim and hope to be as close to the actual situation or effect as possible. Can we just say that we should acknowledge the degree of certainty, or ‘plausibility’, of our statements? Or acknowledge that a speaker may be totally certain about their claim, but listeners are entitled to have and express less certainty, e.g. by assigning a different certainty, probability or (I suggest) ‘plausibility’ to the claim? Leaving a crumb of plausibility for the ‘black swan’?

‘OUGHT’ claims and their assessment: ‘Plausibility’ rather than ‘truth’

For some other kinds of claims, the labels ‘true’ or ‘false’ are plainly not appropriate, nor even ‘probable’. Those are the ‘ought’-claims we use when discussing problem situations (understood as discrepancies between what somebody considers to be the case or probable, and what that person feels ‘ought’ to be the case). The state of affairs we ‘ought’ to seek (or the means we feel we ought to apply to achieve the desired state) is, equally by definition, not ‘true’ yet. So should we use a different term? I have suggested that the label ‘plausible’ may serve for all these claims, expressed as a number between -n (totally implausible, virtually impossible, or the opposite being true) and +n (virtually certain), where n is for example 1, with the midpoint zero denoting ‘don’t know’, ‘can’t tell’. Reminder: these labels express just our states of knowledge or opinion, not the states of affairs to which they refer: we make decisions on the basis of our limited knowledge and opinions, not on reality itself (which we know only approximately or may be unsure about).
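A minimal sketch of the suggested plausibility scale, assuming n = 1, may help. The function name and the two intermediate labels (‘leaning plausible’, ‘leaning implausible’) are illustrative assumptions; only the endpoints and the midpoint come from the text.

```python
def describe_plausibility(pl, n=1.0):
    """Translate a plausibility judgment on the scale -n..+n into words."""
    if not -n <= pl <= n:
        raise ValueError(f"plausibility must lie between {-n} and {n}")
    if pl == 0:
        return "don't know / can't tell"
    if pl == n:
        return "virtually certain"
    if pl == -n:
        return "totally implausible"
    # Intermediate labels are an assumption made for this sketch.
    return "leaning plausible" if pl > 0 else "leaning implausible"

# The same claim can carry different plausibility for speaker and listener.
speaker, listener = 1.0, 0.3
print(describe_plausibility(speaker))
print(describe_plausibility(listener))
```

The values express a person’s state of knowledge or opinion about a claim, not the state of affairs itself, so two people assigning different values to the same claim is expected rather than an error.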

How can we gain plausibility of claims? 

The question then is: how do we get to know whether any of these claims are ‘true’, or probable, or plausible, and to what degree? By matching? Or, since we can rarely attain complete certainty (knowing that there can be ‘black swans’ to shatter that certainty): how can we increase the degree of plausibility we feel we can attach to a given claim? What are the means by which we gain plausibility about claims? Possibilities are:

1)  For ‘fact’-claims: 

1a) Personal observation, experiments, measurements, demonstration, ‘tests’. 

1b) Inference from other fact-claims and observations, using ‘logically valid’ reasoning schemes;  

1c) From ‘authorities’: other persons we trust to have properly done (1a) or (1b), and who can explain or have explained this;

1d) Declaring them ‘self-evident’ and thus not needing further explanation.

2)  For ‘ought’-claims:

2a) The items equivalent to (1a) obviously don’t apply:  So: Personal preference, desire, need, accepted common goals or ‘laws’

2b) Inference? The problem here is that inferences with ‘ought’-claims (what I call ‘planning arguments’) are inherently not (deductively) ‘valid’ from a formal logic point of view, and the label ‘true’ does not apply to them. However, for some of the factual premises in these arguments, the means listed under (1) will apply and are appropriate.

2c) From authorities:  Either because they have done 2a or 2b, or because they have social status to ‘order’, command ought-claims?

2d) ‘Self-evidence’?  For example: ‘moral norms’? Laws? 

Is ‘self-evident’ equal to ‘absolute’?

We could add claims about ‘meaning’, definition etc. as a third category. For all of them, is the claim of ‘absolute truth’ equivalent to ‘self-evident’? It is the only kind for which explanation, justification or evidence is not offered, and is even claimed to be impossible or unneeded. What this means is: if there are differences of opinion about a claim, can the proponent of such a claim expect to persuade others to accept it as theirs? What if both parties honestly claim and believe that theirs is the absolute truth? Claiming ‘absolute truth’ or ‘right’ or ‘self-evidence’ is not a good persuasion argument, but if repeated sufficiently often (brainwashing) it is surprisingly effective, history tells us. If justification (e.g. by demonstration) is attempted, it turns into one of the other kinds.

So, for all these claims and their ‘justification’ support, different people can have different opinions (different plausibility degrees). This is all too frequently observed, and is the source of all disagreements, quarrels, fights, wars. The last item (war) suggests that there is a missing means for acquiring knowledge: the application of coercion, force, violence, or in the extreme, the annihilation of persons of different opinions. The omission is based on the feeling that it is somehow ‘immoral’ (no matter how frequently it is actually applied in human societies, from the upbringing of children to ‘law enforcement’ and warfare).

The need to shift attention to ‘decision criteria’ and modes acknowledging irreconcilable differences of opinion

There is, for all the goodwill admonished by religious, philosophical and political leaders, the problem that even with ample efforts of explanation and offering exhortation, reasons, arguments, definitions, situations may occur where agreement on the claims involved cannot be achieved — yet the emergencies, problems, challenges demand that ‘something must be done’. 

What this means, in my opinion, is that the noble quest for ‘truth’, probability, even plausibility as the better guide for community and social decisions (‘solution’ criteria: making decisions on the basis of the merit, value or plausibility of contributions to the discourse about what we ought to do, which ideally we would all agree on!) must be shifted to a different question: what criteria can we use to guide our decisions in the face of significant differences in our opinions about the information supplied in the discourse? The criteria for evaluating the quality and plausibility of proposed solutions should be part of, but are not the same as, the criteria for good decisions. It is interesting to note that the most common decision mode, voting, in effect dismisses all the merit concerns of the ‘losing’ minority. Arguably, it should be considered a crude crutch for the claim of ‘democratic’ ideals: equality, justice, fairness to all. Note also that the very crisis cry “Something must be done” is often used as an exhortation tool to somehow generate ‘unity’ of opinions.

Issues for Systems Thinking

I suggest that this is an important set of issues for systems thinking. Systems Thinking has been claimed to offer ‘the best currently available foundation for tackling humanity’s challenges’. But has it focused its work predominantly on the ‘IS’ questions of the planning and policy-making discourse, rather than on the ‘ought’ issues? On better understanding of the (existing) systems in which we will have to interfere? On better prediction of different plan proposals’ future performance (simulation)? Sure, those tasks are immensely important and the work on these questions admirable. But are they the whole task?

As far as I can see, the other (‘ought’) part of planning and policy-making work is at best still in an embryonic state. This includes both a) the development of better evaluation (measures of the merit of planning discourse contributions, leading to ‘solution merit’ criteria) and b) the development of better criteria for planning decisions in the face of acknowledged disagreement about the merit of information contributed to the discourse. Systems thinking appears to many (perhaps unfairly so) as suggesting that decisions should be based on the assessment of ‘facts’ data alone, ignoring the proper assessment of ‘ought’-claims and how they must be combined with the ‘fact’-claims to support better decisions.

The development of a better planning discourse platform

Of course, the ‘discourse’ itself about these issues is currently in a state that does not appear to lead to results for either of the above criteria: the design of the discourse for crafting meaningful decisions about humanity’s challenges is itself an urgent challenge. If I had not convinced myself, in the course of thinking about these issues, that ‘absolute truth’ is a somewhat inappropriate or even meaningless term, I would declare this a main ‘absolutely true and important’ task we face.

–o– 

On gratitude for being shown the true extent of problems?

On a Steaming Hot Midsummer’s Day In the Fog Island Tavern

– Trying to make some breeze with your head wagging, Abbé Boulah? Not sure that’s very energy-efficient?

– I agree, — but your tropical ceiling fans don’t quite do the job either, Vodçek. And I can’t get myself to put ice in my Zinfandel to cool myself from the inside. But no, it’s not even the heat here that’s making me wonder about the state of things. Though it definitely has to do with hot air. Of the political kind, that is.

– Hmm. Maybe I should  renew my old rule about political talk in here when it’s this hot. So what’s in that little blue book you’re studying that’s creating that strange attitude in you? 

– Well, it’s the U.S. Constitution — have you ever actually read it? With all the hot air being blown around all over about it, I thought I’d take another look at it. 

– You, a damn furriner? Because I don’t think you’re a citizen yet, just a green card guy, are you?

– Right, Renfroe. See, I never could get myself to assert the required degree of allegiance to that document. The allegiance expected when you voluntarily take on a new citizenship. It’s not like when you’re born here: then you’re a citizen, subject to the rules and Constitution, willy-nilly. Nobody asks you as a kid, when you’re made to stand up and swear allegiance to it, if you’ve even read and understood it. What would happen if you said: “Wait, there’s something I don’t really understand and agree with here, so swearing allegiance would be, well, a lie…”? If you said that as an adult about to become a citizen (because would it be right to start that new life with a lie?), it’s easy: you don’t get to be a citizen. So you don’t apply, and don’t get to say things like that. But kids? Even adult citizens?

– There is a process where we can make amendments to change it, isn’t there? 

– Yes, I know. But it’s a long process, takes a long time even to get to a vote. And what if it doesn’t pass?  If you argued for an amendment and it lost, are you now an enemy of the constitution, of the interior sort,  against whom citizens are supposed to defend it? 

– Oh boy, I never thought about that.  So what are the things in the Constitution you don’t agree with? 

– You’re asking me, Renfroe?  Me, the damn furriner, who doesn’t  really understand the Constitution and isn’t allowed to join the discussion about changing it? 

– Well, are you against it, then? 

– No, Vodçek. On the contrary, I have always considered it a major achievement of humanity and a model for many other countries. But look at some of the weird things that are now developing! There must be some not so perfect things about this Constitution, if those things are possible and allowed under it? So I was just curious about what all the hot-air hubbub in the current political discourse is about, that involves the Constitution in one way or another. Wondering why people, on all sides of the political divides, don’t start talking about what could or should be changed in it to keep some of the strange things from happening that arguably are, well…

– Unconstitutional?  

– Yeah! By all the Wall Street Bull’s Excrement! That’s the word! Detrimental, dangerous in the long run,  as well as powerfully ill-smelling. 

– Is it just your non-belief showing, my friend? Even hate? 

– Well, you must admit some unbelievable people are getting away with unbelievable stunts.

– But they’re not getting away, looks like.  They’re here to stay, at least for four more years, if not more…

– You’re shrewdly tiptoeing around the question: what makes that possible? That’s what I want to know.

– Oh, I think there’s a good explanation: If you put out many contradictory tweets, you give all the true believers the freedom — freedom, isn’t that the big constitutional thing? — the freedom to pick whichever one to believe in?  And act on it?  Isn’t that the great MAGAIC? That we all should be thankful for? 

– Sounds great, until you’re branded and treated as a traitor, told to leave the country if you don’t like what it’s becoming, for believing in one that the great MAGAIC master feels threatened by. Come to think of it, the country that was stolen and cheated away from the true Americans in the first place? So that nobody even starts to think about steps that might be taken to prevent those unconstitutional things? It seems the good citizens of the country aren’t quite awake and sufficiently worried about these things to start some serious thinking about that.

– Well, there are plenty of groups out there clamoring for change, aren’t there? Nonbelievers, traitors, the lot of them, I say…

– Sure, Renfroe. Even more questionable things are promoted under the banner of needed change, even if the change is represented as going back to some ‘real’ or true interpretation of the Constitution. What gets me: it all boils down to campaigns to defeat this or that candidate for office, or the counter-candidate. All the ads and emails we get are just appeals to contribute money for the campaigns. About winning elections, gaining power. Very little if anything seems to get to the substantive issues that make the bad things happen, let alone making them better. Can they be fixed with just some different guys in the various offices? When the underlying structural conditions still will lead to the same bad developments that people get upset and angry about?

– Good questions. So you think the country needs to wake up to see the need for real changes? 

– I do, Vodçek. I know all the unrest and the breathless media look like you’d want to calm things down rather than more ‘waking-up’. But is it making any genuine difference? 

– Hmm. So let me ask you a hypothetical question. Say you were in charge of things. What would  wise old  y o u  do to wake up the country to see the seriousness of the weaknesses in the Constitution? 

– Good question. And sufficiently hypothetical to avoid violating your no-politics-in-the-Tavern rule?

– We’ll see. Let’s just say I’m not paying close enough attention for a while. 

– Okay. Let’s pin your hypothetical assumption down first — that I’d really be somehow in charge to get things done.

– That’s hyper hypothetical indeed, I agree. Hyperthetical. But why?

– Well, from what I can see, all the protesting and well-meaning commentary hasn’t been very helpful, so far. Are things getting worse rather than better? So I think a very different tactic is needed. Actual demonstration, real action, perhaps even painful lessons.

– Hmm. All right, we’ll put you in your hyperthetical charge. So what’s your first step? 

– It’s not a well-ordered step-by-step sequence. Things are done simultaneously. To be effective, it needs distraction, confusion. But of course, explaining it has to start somewhere, and go on in a sequence. Don’t confuse that with the actual process!

– Duly noted. 

– So one thing I’d do is to refuse to make any of the usual disclosures required by law: tax returns, financial holdings and such. That’ll keep a lot of people aggravated and busy with ‘investigations’ of various kinds. It would even let me make a lot of money while that is going on. And I’d not even hide that. Let people get jealous! Meanwhile, I’d stuff the courts and important government positions with people who will do what I ask them to do. Anything. I’d keep firing and replacing them if they don’t. If there are complaints about that, I’d invent details from their work that let me call them traitors or criminal incompetents.

– Hear that, Renfroe? So what will those people do?

– Good question. While I’m putting out silly controversies for the headlines every day, they must make as little noise and get as little attention as possible, while they are relaxing or eliminating a lot of regulations, things like environmental protection bureaucracy rules that hamper certain industries or reduce their profits. So most people won’t realize it until the consequences become obvious — that’ll take some time, right? Obviously, those industries will support my policies and campaigns promoting ‘economic growth’. So while all those personnel changes are represented as efforts to combat corruption, they actually raise a smokescreen for intensified, let’s call it facilitated merited compensation for activities and contributions to the mission. 

– Corruption, in other words. Won’t the media raise a ruckus about  that?  

– That’s a harsh and unfriendly word, we’d have to keep the media from using it. We reserve that for when we talk about the opposition. But yes, that’s the point. Now consider: every issue has ‘counterarguments’:  small aspects that can be exaggerated into threats to national security or economy, hyped up to get my base supporters firmly convinced that I’m saving the country from disaster or evil conspiracies. But eventually seen as what they are. 

– You seem quite optimistic about that?  

– Yes, because the rising inequality and injustice of it will become too obvious. But until then, I’d use those issues to paint the media as part of the evil conspiracies, as traitors, tools of unpatriotic groups or parties that only seek power. Which of course greatly increases my power; especially if I can get the owners of the media to keep their journalists on a short leash. There are ways to get that done, you know. Not by me: by others. I’d have a lot of help doing that. 

– Let me guess: the powers that have bought the media are holding the leash.  

– You’re catching on. That’s one part the Constitution doesn’t deal with well:  it didn’t anticipate that economic powers could buy both government and the media. I’d let them run with that, while getting things in place for a real power grab. I don’t even offer reasons for those things, just do them. Like killing off the post office to make voting by mail impossible: it’s just necessary to keep those extreme wing elements from taking over the country, you understand. 

– You mean the extreme left wing guys? 

– Left wing, right wing: did it never occur to you that right or left depends on where you’re looking from? See, when I talk to the Senate, the so-called left wing folks are sitting to my right, and I accuse them of many of the things that the right wing folks to my left are actually doing…

– I guess you’d have such a devious game plan for the police and law enforcement too?

– Of course, glad you mention it. But those levers of power enhancement are already well in place and only need to be further cultivated to become fully aligned with my intentions. Look at the very term ‘law enforcement’! What does it tell you?

– Of course, there has to be a way to enforce the laws. 

– Yes. Everybody accepts the notion that laws must be upheld and violations prosecuted and penalized, and that this requires force. Greater power and force than any would-be lawbreaker, of course. Naturally. By definition.

– You can’t argue with that. 

– See? That kind of lack of imagination would make it easy for me. But equally inevitably, it creates escalation. For example: if you ease the hurdles for everybody — including organized and disorganized crime — to get access to more powerful weapons, doesn’t it stand to reason that the law enforcement agencies  m u s t  be equipped with even more powerful equipment?  That’s the box people can’t think themselves out of.  So I’ll provide that, and encourage them. Then, criminals as well as the second-amendment militias counter that with more effective gadgets. So give the police military-type weaponry. It needs to get used up, anyway, to maintain the economic growth of the industries producing new stuff, see?  I’ll use every little confrontation or mis-step to increase the perceived need for more  power, even to bend the rules if needed. Until they become so powerful — but loyal to me — that there’s no viable opposition left that could threaten my power. And no bars to the temptations of abusing the power. That’s a natural law, if history tells us anything. 

– You don’t think all the folks who insist on their second amendment rights to have and carry weapons are going to start trouble about that? 

– Are you kidding? Tell them it’s their right, their power!  It’s even a bit ironic, isn’t it? Their very support is what ‘forces’ me to create those superior law enforcement and military forces, that ultimately will make their pathetic excuse ‘to protect them against the government’ the contradictory illusion it is. Subterfuge for selling more guns. To finally become so obvious it can’t be sustained. Admit it: people just like to shoot guns.  And, at least some folks like to kill with them. Some of them  like violence, destruction.  Killing. So I encourage the emergence of different factions and groups — but turning them against each other, rather than the government — while assuring all of them that the government will protect  t h e i r  particular groups. 

– Fascinatingly devious, I admit. 

– Yes. The gun issue is just one of the best examples of how the Constitution can be interpreted in so many different ways as to make all kinds of devious machinations possible and apparently ‘legal’, while causing considerable trouble. There are so many different areas (immoral enrichment, corruption, outright crimes) where one can break laws and constitutional provisions and get away with it; the point is to use all those ‘loopholes’ to make the abuse obvious.

– Hmm. I don’t know. You really think that making all that so obvious will, as you say, make the people ‘wake up’ and take action? It actually scares me to think about what those actions might be. Getting a feeling it actually might be too late already, to avoid either chaotic unrest, or decline into, well —

– Don’t say it, Renfroe: It would get too seriously political: Vodçek is already breathing heavily, ready to cut us off.  

– Don’t tempt me, my friend. You’re getting close. Hmm. Why does all that somehow sound eerily familiar? Can you at least tell us what remedies you have in mind for fixing the flaws in the Constitution that allow all that?

–  That’s just it: there’s not enough thinking going on about that. That’s what we need to get started! And shouldn’t you all be grateful for my opening your eyes about it? 

– Ah. Yes, I can see why you’d insist on gratefulness and loyalty. 

– And that’s why I’d need four more years, don’t you see? 

– Good grief. You’re making my head spin. 

– Don’t say it, Vodçek:  revolve? 

– Okay, that’s it. You’re cut off. 

–o–

EVALUATION IN THE PLANNING DISCOURSE – SUMMARY

An effort to clarify the role of deliberative evaluation in the planning and policy-making process. Thorbjoern Mann, May 2020.

INSIGHTS / IMPROVEMENT SUGGESTIONS: CONCLUSIONS?

The two dozen blogposts over the past few months try to explore the many facets of deliberative evaluation as it relates to the planning discourse. Necessarily, the issue-by-issue treatment does not do justice to all the connections and relationships between them. Many questions that call for more exploration, testing and research were raised, but of course not resolved. Faced with public planning tasks today, we always have to make decisions based ‘on the best of our current incomplete knowledge’, so it seems appropriate to try to summarize that current state of knowledge. What can be learned from this exploration? The following notes highlight a few insights for discussion.

No ‘universal’ common approach

The first answer to this question may sound disappointing: There are so many different attitudes, perspectives, situations and tasks involved in planning that any suggestion of a ‘standard’ common approach or procedure would be rather inadequate to the specific conditions of each case, in one way or another. Therefore, it would be pointless to try to make general recommendations about details for specific approaches. They would be of the kind “if approach or technique X is used, it should be done with specific details x1, x2, etc.” Those recommendations should be included in the specifications for individual techniques in the tool kit. The only meaningful ‘general’ rule should be to coordinate those agreements with tools used in other phases of a project, for the sake of consistency and to avoid confusion due to too many different jargon terms and rules.

Critical issues

In the course of the discussion, the initial set of issues calling for discussion had to be revised. Some questions emerged as more controversial and difficult to reconcile than others. They involve what seem to be fundamental theoretical objections to systematic ‘methods’ of evaluation, or simply efforts to sidestep the question since it is seen as an unnecessary, cumbersome addition to a planning project. The first of these positions rests on the confidence that a valid theory applied to the process of generating planning or policy proposals will make evaluative scrutiny unnecessary; the second on belief in such concepts as ‘wisdom of crowds’, or the superior ability of intuition (of policy developers, of participants in the discourse, or of ‘leaders’ making the decision).

A related question arises from the practice of a number of ‘management consulting’ approaches that rely on facilitator-guided small group events aiming at consensus or consent decisions, or at recommendations for single solutions generated by a theory (such as the Pattern Language) or by orchestrated discussion. Such groups usually consist (in the case of organizations contracting with an outside consultant firm) of selected company employees with special skills or detailed knowledge of the problems to be remedied. The ‘decisions’ reached then become recommendations to the organization’s management. If such approaches are suggested for larger public projects, they would take the form of ‘expert’ panels informed of the public’s concerns through surveys or interviews, reaching their recommendations in small group discussions that usually do not involve any formal systematic evaluation procedure. To allow for a greater degree of public participation, it would become necessary to construct a hierarchical structure of small face-to-face group ‘circles’ (to adopt the vocabulary of one such approach). Each higher hierarchy level of circles consists of representatives of the lower circles, using the same approach or facilitating mode to process the results of the lower circles into recommendations for the next higher level. This problem constitutes one of several strong arguments for an ‘asynchronous’ online but ‘flatter’ organization of the discourse, which is precisely the aim of the overall ‘public planning discourse support platform’ for which new forms of discourse orchestration and decision-making are needed.

The map of critical issues in the diagram resulting from these insights had to be revised, showing the elements of evaluation on one side and the issues arising from different views about the role of evaluation in the planning process on the other.

Figure 1 — Issues and Controversies, Revised

Embedding a ‘toolbox’ of specific techniques in an overall framework

The needed systems of technological support of a general planning discourse platform or forum with wide ‘asynchronous’ public participation for larger projects will have to adopt some common assumptions, agreements, and vocabulary. Some such agreements are of course needed for any small or large project, whether based on F2F interactions or not. Any platform will imply some such agreements, and this poses a significant challenge to its design: to keep a delicate balance between those necessary agreements and the need to accommodate different views even about such initial provisions. One key lesson from the exploration is that there is a large variety of perspectives on which agreements would rest. The platform should not impose one such perspective but must remain flexible, open to the variety of views and preferences participants may bring to the table. It should focus on reaching common agreements for each project, as an integral project task, based on decisions by each project’s participants.

The overall framework must therefore be as simple and inviting to potential participants as possible. As people become more familiar with the platform, it can then offer guidance and opportunities for selecting special techniques and methods from a ‘tool kit’ collection of techniques and tools to facilitate in-depth analysis and evaluation of the particular issues arising in different projects, as needed in the perception of participants. The choices should include the option of reaching recommendations and decisions without any explicit systematic deliberation. This, of course, raises questions about what would make decisions legitimate and compelling for the affected populations, and what responsibilities or ‘accountability’ provisions it would raise for the respective decision-makers. The idea of using the ‘currency’ of ‘discourse merit points’ to require decision-makers to pay for decisions begins to address this issue.

Figure 2 — A ‘basic’ (‘neutral’) planning process with evaluation as an optional ‘toolbox’ element

Procedural agreements and process

The need for flexibility can be accommodated with the provisions for the procedures to be followed to reach a decision. The diagram below shows one example of a basic framework, drawn from the tradition of parliamentary procedure that will be familiar to most people in countries with parliamentary-type governance. The key feature is the ‘Next step?’ motion that can be raised at appropriate times during the discourse, that can call for a decision, etc. but also for the implementation of a ‘special technique’ for more thorough analysis.

Translation services: between languages, and from disciplinary jargon to conversational language

Many problems facing humanity today are ‘international’ or even ‘global’, with affected parties living in areas governed by many different government entities, speaking different languages. Thus, a general platform for the treatment of such projects must provide adequate translation services between different natural languages, as a matter of course. But since the discourse will draw on scientific and professional knowledge from many disciplines (and consulting firms), it will also need translation from the ‘discipline jargon’ of the contributing experts into conversational language.

The argument against ‘argumentation’ as unnecessarily ‘argumentative’ and adversarial

The investigation was largely motivated by the initial question of how to evaluate the merit of arguments in what Rittel called the ‘Argumentative Model of Planning’. The objections against the very word ‘argument’ can of course be dismissed as misunderstanding the meaning of the term: It is not a ‘fighting word’ implying a basically adversarial attitude but an offer of reasoning (a reason) showing how a position either for or against a proposal will ‘follow from’ or can be supported by premises that the audience already accepts or will come to accept upon being shown further evidence.

The existence of such misunderstanding must be acknowledged as a potentially significant and destructive factor in the planning process. I have suggested clarifying the distinction with a different label such as ‘quarrgument’ for the kind of exchanges leading to adversarial-only verbal or physical ‘quarrels’. But a better option is perhaps to avoid the term entirely, with a provision to immediately replace an argument (if one is entered into a discourse) with the questions about the premises used. Instead of the ‘argument’ version of an entry like:

“Plan A will cause effect B, given conditions C,

B is desirable, and

C will be present”

(which may be stored for reference in the ‘Verbatim’ record of the platform), the displays for the assessment will show the questions:

– Will A cause B, given conditions C?

– Should effect B be aimed for? and

– Will conditions C be present?

The aggregation into argument plausibility, argument weight, and plan plausibility follows the same steps as those shown in the section on planning argument evaluation but skips the display of argument plausibility and weight, to hide the controversial term.
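The aggregation steps are not spelled out in this section, so the following is only a minimal sketch under stated assumptions, not the platform's actual procedure. It assumes that each participant answers the three questions on a plausibility scale from -1 (certainly not) to +1 (certainly so), that argument plausibility is the product of the three answers, that argument weight is that plausibility multiplied by an assumed weight of relative importance (0 to 1) for effect B, and that plan plausibility is the sum of the argument weights. All names and sample values are hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class QuestionJudgments:
    """One participant's answers to the three questions behind one entry.

    The scales are assumptions for this sketch: plausibility from -1 to +1,
    importance from 0 to 1.
    """
    will_a_cause_b: float         # "Will A cause B, given conditions C?"
    should_b_be_aimed_for: float  # "Should effect B be aimed for?"
    will_c_be_present: float      # "Will conditions C be present?"
    importance_of_b: float        # relative weight of concern B


def argument_plausibility(j: QuestionJudgments) -> float:
    # Assumed rule: multiply the plausibility judgments of the three premises.
    return prod([j.will_a_cause_b, j.should_b_be_aimed_for, j.will_c_be_present])


def argument_weight(j: QuestionJudgments) -> float:
    # Assumed rule: scale the argument plausibility by the importance of B.
    return argument_plausibility(j) * j.importance_of_b


def plan_plausibility(entries: list[QuestionJudgments]) -> float:
    # Assumed rule: add up the weights of all entries about the plan.
    return sum(argument_weight(j) for j in entries)


# Hypothetical judgments: one entry favoring Plan A, one opposing it
# (the opposing entry has a negative answer to "Should B be aimed for?").
favoring = QuestionJudgments(0.8, 0.9, 0.5, importance_of_b=0.6)
opposing = QuestionJudgments(0.5, -0.8, 1.0, importance_of_b=0.4)
overall = plan_plausibility([favoring, opposing])
```

Only the answers to the questions and the resulting plan plausibility would need to be displayed; the intermediate values can stay hidden, in line with the suggestion above. A plain product has a known weakness: two negative answers multiply to a positive value, so an actual procedure might use a different rule.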

The ‘subjective judgment versus objective fact and measurement’ controversy

The discussion of evaluation here cannot offer a ‘resolution’ of the controversy whether design and planning decisions should be based on subjective (intuitive) judgments or objective (‘rational’) measurement-based ‘facts’, nor how to distinguish between these kinds of judgments. The recommendation is — for the time being and for the sake of effective process in given practical situations — to leave the controversy aside. Instead, whenever a situation arises in which a decision-maker is asked to or claims to make decisions ‘on behalf’ of other affected parties, the recommendation is to call for the mutual explanation of the respective bases of judgment: explanation to the satisfaction of the other party, not to some theoretical standard or expert opinion. This may shift the issue to the realms of general research, education or public information. It may be in need of research and clarification in those domains, but it cannot be settled separately in the area of planning.

Claims of validity of planning and decision-making methods

The investigation of evaluation in the planning process was motivated by a sense that the planning decision-making process is in need of improvement (especially with respect to evaluation) and a sense that some improvement is possible. This should not be taken as a claim of being a ‘more perfect’ approach. Rather, the insights from the review suggest that such claims would be pretentious and inadvisable. As just one example, consider the expectation that a planning decision should be based on ‘due consideration’ (and thorough evaluation) of ‘all the pros and cons’ about a planning proposal, as a leader may solemnly promise. It may seem plausible at first sight, but it was seen that it is difficult if not impossible to be certain that ‘all’ those arguments — all potential evaluation aspects — have been or even can be identified. From a systems modeling perspective, the question of the proper boundary of the system at hand (and its acceptance) is a matter of the system modeler’s judgment more than of the system’s ‘true’ properties. The pressure to justify model assumptions with data leads to a preoccupation with past data and measurable variables, over future unknown possibilities, new research knowledge and subjective motivations.

Argumentation as practiced in ‘parliamentary’ discourse predominantly deals with ‘qualitative’ effects: an argument that ‘plan x will achieve a precise quantitative outcome y of variable v in a specific time frame t’ is not nearly as plausible as the general but vague qualitative version that ‘Plan x will, in time, improve things with respect to v’. And the qualification of planning arguments ‘given circumstances or conditions C’, if taken seriously, will call for an interminable systems analysis of the arguments’ complex context. Realization of such interminable complexity will quickly nudge participants to end more thorough scrutiny of these questions: understandable and perhaps even defensible, but not justifying claims of ‘perfect’ method. But should such questions arise — and they arguably should sometimes be encouraged — systems modeling and data analysis, diagrams and visual mapping can enhance participants’ understanding, and should be offered as needed in the discourse. By the same token: the possibility of systematic assessment, e.g. of arguments, will necessitate weeding out repetitive entries in the displays and worksheets of the discourse, which can improve overview and understanding.

A further warning to avoid ‘obvious’ confidence in premature judgments must be seen in the many different forms of aggregating both personal and group judgments into decision guides or indicators — they should not even be called and misused as ‘decision criteria’.

Requirements for acceptance: training, education

Even with the best efforts for making the basic framework as simple and understandable to lay participants as possible, the variety of possible attitudes, expectations, assumptions and corresponding techniques and tools raises the question of accessibility for as many segments of communities as may be affected by planning projects and the problems they aim to address. How can the average person comfortably participate in the planning discourse if the concepts, language, tools and needed procedural agreements are unfamiliar and thus confusing? Even this ‘average’ expression is ‘wrong’: don’t crises and emergencies tend to affect and hurt poorer, less educated people more than even the ‘average’ members of the community? But it is the information of those people that is needed to properly address their concerns.

Traditionally, a main task of public education is to prepare citizens for the planning and political discourse. It is not likely that the needed understanding and skills required for even basic participation in the kind of online asynchronous planning and policy-making process sketched out in the proposed planning discourse platform and its provisions for evaluation are offered by current education systems. And the prospect of getting the bureaucracies of all the world’s education systems, whether public or private, to include this material in their curricula itself looks like a planning project of unprecedented magnitude and complexity. So it seems that the task of educating and training all potential users as well as the needed staff for the platform calls for radically new approaches. Would an online ‘planning game’, based on a simple version of the process, run on cellphones that are increasingly available even in poor communities be a better step towards this task? (This idea was tentatively explored in a paper on Academia.edu). The challenge of education and training itself might be the first project serving as the necessary test case and experiment, fueled and funded by not much more than all the consultants’ competitive desire to have their approach included in the ‘tool kit’ of a simple common overall platform and process.

–o–

The Missing Concern: It’s About Power, Systems Thinkers!

From the Fog Island Tavern

……………………………………………….Discourse Bog-Hubert Thinking (of some kind)

– Say, Vodçek: What is Bog-Hubert doodling over there? So engrossed in his bubbles… Not even noticing his coffee getting cold?

– Yes, Sophie, I’ve been wondering too. He does get intense trying to think sometimes. Must be some bug Abbé Boulah put in his ear: they had a long discussion here about something a while ago. Did you hear what it was about, Dexter? Sitting closer to them, weren’t you?

– Yes, but I didn’t understand what it was about. Sounded like some new economic or governance system they are trying out over on the Rigatopia rig. Made me curious all right, but I wanted to wait to ask them to explain until they’d worked it out. But then Abbé Boulah had to go somewhere, left Bog-Hubert stewing in his bubble network, lost in his system loops. So I guess it’s not fully cooked yet, what he’s stewing over.

………………………………………………………………………….Throwing Out the Old System?

– Systems, huh? About government? It’s about time we got some systems thinking into that.

– Come on, Vodçek: Isn’t the government the very system that’s gotten so rotten it really should be replaced?

– Why Sophie, I am shocked. Have you gotten into the throw-out-the system crowd too?

– I don’t know about that, Vodçek. I just hear so much talk about ‘new system’ this, ‘new system’ that, — if the system is so bad, why do we need more ‘systems thinking’ to fix it?

– So you are into the new ‘awareness’, the holism, Gaia, the WE not ME movements? Don’t they all want to throw out the current ‘system’? Even the folks who are ranting against BIG Government, even as they are running like crazy to get to run it?

– Don’t throw all that into the same trash bin, Vodçek. There are differences: ‘throwing out’ is one thing, but what some of those people are talking about is ‘system transformation’. And I don’t think you’ll deny that there are some things that are very wrong in the current ‘system’ or whatever you want to call it?

– Okay, should we try to sort this out? Maybe professor Balthus — just coming in there — can help? Good morning, professor.

– Good morning. Help out with what? If you can get me some coffee, Vodçek, and explain your conundrum, I’ll try.

………………………………………………………………………………………………………..Conundrum?

– Well, here’s your coffee. Now, I don’t know if it qualifies as conundrum. It is a little strange. We are seeing Bog-Hubert over there, uncharacteristically oblivious to anything else around him, working on some diagram. We suspect it’s about a system of some kind — he’s already used up four of my napkins. And while we were speculating about what it might be, the unsavory issue came up about all those movements that are calling for systems change, new systems, awareness and throwing out the old system etc. Did I state that to your satisfaction, Sophie? Dexter?

– You left out ‘system transformation’, Vodçek.

– Sorry about that. Okay, Transformation, too.

– I see. No, wait: I don’t see. What’s the problem?

…………………………………………………………………………………………Problem Embryonics

– Ahh yes, the problem. There seems to be an embryonic but, I suspect, fundamental disagreement: Calls for more ‘systems thinking’ clashing ominously with calls for throwing out the system, and all the thinking associated with it. Getting close to a quarrgument.

– Oh brother. A systembryonic quarrgument? Calls for more coffee, make sure I’m really awake yet. Well, I agree that ‘the system’ they are complaining about has some, let’s say, inherent problems. But I confess that I have gotten tired, in my old days, of all those calls for throwing out the system, in whatever new ideological or spiritual getup or camouflage.

– Why is that?

……………………………………………….‘System Transformation’ — Or ‘Regime Change’?

– Oh: Take a look at history. Some older and recent experiences with system overthrow, for example. Revolutions. Some of those were motivated by ‘systems overthrow’ of the ‘new system’ kind. Others were acclaimed, if not secretly or openly supported, by folks who would arguably be considered by the former as representing the ‘old system’ — as just a different system. They tended to call it ‘regime change’ — even if the ‘transformation’ turned out just as revoltingly bloody and disgusting as the revolutions of the first kind. So is it necessary to take sides, to recognize that in too many cases the outcome was strikingly similar?

– What do you mean?

– Well, look at what happened! Yes, they got rid of some nasty people. Tyrants, dictators. Installed ‘democracy’, perhaps, or some regime based on religion. With new leaders, the heroes of the revolution? Or relics from some still earlier old regime two or more revolutions back, bringing back the oh so good old days? Either way: a few years later: they look suspiciously like just another oligarchy or dictatorship. And the calls for throwing out the rascals start all over. Do we ever learn?

– Huh.

……………………………………………………………………………………..What Are We Missing?

– So what’s the lesson there? What are we missing?

– Good question, Sophie. And I don’t see it being asked — it’s asking what’s missing from both, from all the old and new systems, what’s the common flaw?

– Why is that, professor?

– If I knew the right answer to that, would I be sitting here letting my coffee get cold? Even to that one, there are several hypotheses, theories. You have one, Vodçek? Or you, Commissioner? I see you both twitching.

– Yes. Isn’t it obvious? The people where we tried to help getting democracy, they’re just not ready for it. They need strong leaders, but they don’t know how to elect the right ones.

– Ah, you mean US?

– No domestic politicking here, folks, or I put diuretics in your coffee and send you off into the poison ivy brambles outside to pee.

– Trying for a strongman stature, are you, Vodçek? Getting with those trends?

– Seriously, guys: is this a matter for bad jokes?

– Okay, Sophie: what’s y o u r theory? Spiritual awakening? Prayer back into the schools? Or closing public schools and leave education to the churches, synagogues, mosques? Pagan full moon dances in the steaming jungles of North Florida?

– That’s it, Vodçek: You a r e using the strong man tactic to scare us out of our wits!

– Well, professor: I assume you’re going to suggest a stronger role of science in governance, aren’t you?

………………………………………………………………………………..Detour: Science in Charge?

– Wouldn’t hurt, but if you expect me to argue for scientists to run government, to become the great leaders, no. No philosopher-kings either, much as I hate to get into quarrels with Plato fans.

– Didn’t we discuss this issue some time ago here, about how designers, planners, and I assume government leaders should take a lesson from that science rule about hypothesis-testing?

– You mean Abbé Boulah’s adaptation of Popper’s refutation rule?

– Yeah, that’s the one — let’s ask Bog-Hubert about that, he knows Abbé Boulah better. Bog-Hubert: can you take a break from your doodling?

– Yes, yes, Sophie, I heard that, you guys were starting to raise your voices. What Popper said was something like this:

“We are entitled to tentatively accept a scientific hypothesis

(he means some speculation about how the world works)

to the extent we have done our very best to test it —

which means to find evidence — to show that it is wrong, —

and it has survived all those tests.”

Wasn’t that it, professor?

– As far as I remember, yes.

– So even in science, it’s still tentative, no certainty?

– Right, Sophie. Halfway. Maybe we can be certain that when we observe a black swan, the hypothesis that swans are white is wrong?

– Okay. But why ‘halfway’?

……………………………………………………..Abbé Boulah’s Adaptation of Popper’s Rule

– Look at the mantra: It says “tentatively accept” — and “to the extent” etc. Leaving the warning that we might have done some more rigorous testing, tried out some better hypotheses, to become more confident. But never totally certain. Part of that is getting into details, in science, about how to frame hypotheses and how they fit into more general theories, about probability and so on. But it’s actually clearer when we look at Abbé Boulah’s adaptation of that rule to planning: We don’t have tests on the basis of observation and measurement in planning, because planning is all about the future which isn’t here yet. So it replaces ‘test’ with ‘argument’. It goes something like this:

“We can accept a planning proposal as tentatively plausible

(only) to the extent we have done our very best

to expose it to the most critical arguments against it, (the ‘cons’)

and it has survived all those arguments —

meaning that the cons have been shown to either be flawed

or outweighed by the pro arguments in its favor”

– I see, Bog-Hubert. The ‘halfway’ point you mentioned is that just because we have countered all the con arguments against a plan, it isn’t a certain proof like the black swan. But there may be better proposals, or it may solve the wrong problem or be the wrong way of talking about it?

– Couldn’t have said it better myself. And don’t forget that different folks doing the ‘weighing’ — to outweigh the cons, as the rule suggests — may come to different results.

………………………………………………………..Back to the Issue: What Are We Missing?

– So getting back to our issue here, professor: Would one of the theories explaining why we keep making the same mistakes with governance systems, be that we don’t know how to argue well enough about plans, — in all governance systems? And about governance systems?

– That’s a good candidate. I’d say it’s part of the problem. But there are others, I’m sure you know: you mentioned ‘systems thinking’, and other approaches, many consultants’ brands of institutional change management. But do we have to go through all of those?

– Well… I guess, when you said you didn’t have the answer to our first question, you also meant that you didn’t have a better idea than all those approaches?

– Well, what I meant was that there isn’t one big flaw, one attitude, or return to some previous article of faith, that can explain everything. About that, your systems folks are right, there are many forces in the system that interact in ways that we can’t say we fully understand, but they are valiantly trying to get that understanding.

……………………………………………………………………………….Understanding the System? But Acknowledging that We Don’t Know What the ‘Next System’ Should Be

– That’s what we have been saying all along — I mean Abbé Boulah and his buddy at the university. And if we acknowledge that: we see all the interesting theories and proposals and initiatives out there that people are putting out and pushing: we have to appreciate all those efforts, even if we suspect that some of them are not going to work out — but the thing is, we don’t know. We don’t know, and we don’t agree on what the ‘Next System’ should look like. So the interim conclusion is that we should not only let them all pursue those diverse efforts, but even support them — on some conditions, of course.

– What are those conditions — I mean if you were in a position to impose conditions, which I’m not sure I’d be comfortable with?

– The conditions are simple: Don’t try to overthrow the ‘whole’ current system by violent or coercive means; and don’t get in the way of other experiments, even if you disagree with them. This also means not to hurt or persecute people who don’t agree with you. That will probably require that the experiments will be small, local, tolerant of each other.

– Difficult enough to make that work smoothly…

– I agree. But there are two main arguments for that strategy: First, that there are so many different geographical, climatical, cultural and economic conditions that make it plausible to try out many different experiments according to those conditions. That itself might improve humanity’s resilience to emergencies and crises, some of which we can predict and some we can’t.

– Hmm. Makes some sense, though it’s not exactly a shining, promising vision to get an enthusiastic movement going. But what’s the other reason?

– Yes. The other reason for encouraging many diverse, even contradictory experiments? Again: even if we think there will have to be one global, unified ‘system’ to get us through all those crises, we currently have to admit, again: that we don’t know and are very far from agreeing about what that system should be like.

– I see that, yes: Our previous record with grand systems of global aspirations hasn’t been that encouraging. More importantly: if we try to impose one such system, by coercion, force, revolution, propaganda, and other devious means, it will only lead to resistance and wars — wars being of course not only one of the very problems we have to try to avoid, but increasingly so devastating to both losers and victors, that no grand system will help us to recover from them.

……………………………………………..Sharing Experiences For Discussion, Evaluation

– Yes. So the third condition for supporting ‘alternative’ schemes is that they must share their experiences — both successes and failures and problems they run into — so that we — humanity as a whole — can learn what works and what doesn’t, and can eventually come to agreement about whether we can fashion a global system that everybody can agree with and support.

…………………………………………………..Negotiating Common Road Rule Agreements

– Or, if the outcome is that a global unified system is not the way to go, what fewer global agreements we will have to negotiate to keep a more diverse strategy alive and prospering. The common ‘rules of the road’ we’ve been talking about.

– That does sound more like a vision. Still not exactly one you’ll get people to want to fight for though?

…………………………………………………………………………. A Vision Worth Fighting For?

– You mean the folks that like to fight to beat or kill or humiliate other people, or take their stuff? Yes, that will be a problem: if that’s what it takes to make one group or other ‘great’ and successful. But isn’t that precisely one of the problems we are facing, that we’d like to remedy?

……………………………………………………………………..Needed: The Discourse Platform

– Hey, Bog-Hubert: I’d like to hear more about that Abbeboulistic vision, if that’s what it is. How, well, I don’t know how to put this — how would you make that work?

– Great question, Sophie. Yes, there are some provisions we need to talk about. And some new tools. It eventually gets to the one issue that has been strangely neglected in all the approaches we see. Let’s see. First, I think we can agree that there must be a platform, a forum, where all the experiences can be brought in for information, discussion, and working out preliminary agreements. There are some papers that describe what such a platform might be like.

– Who’d organize such a platform? Would it be the UN or something like it?

– That’s a big question. Many people would be against that, because it would be organized along the lines of existing systems — nations — that they see as part of the problem, especially as this translates into the decision-making rules it uses. Voting, for example. Small nations, big nations, money, the way nations are increasingly influenced by big transnational corporations and other entities. And because it looks suspiciously like the Big Brother World Government many people are afraid of.

– So it would have to be ‘impartial’ with respect to any competing governance systems? Tall order.

– Right, Vodçek. We talked about some of the principles that should govern the ‘Public Planning Discourse’ that this entity would support — like the idea that decisions should be more determined by the merit of contributions, arguments, the pros and cons, to the discourse, than by votes. The papers we talked about have suggestions for that, we think, that should be discussed. It would take some effort to get people to learn and understand how that works. What you said a while ago, Sophie, that we don’t know how to argue planning proposals well and make decisions accordingly. Work to do.

– So was that what you were doodling about over there, Bog-Hubert?

………………………………………………………………The Missing Aspect: Control of Power

– Not really. I think we are ready to bring those ideas into the discussion. No, what I talked about with Abbé Boulah was the issue that’s missing in all those approaches and theories — the issue nobody is talking about other than in the traditional terms that are part of the problem…

– Well, what’s that big problem?

– The problem of power, of course. Power. And how to control it.

– Why is that a problem? I mean, yes it’s a fact of life, like hunger and greed and diseases: they happen, it’s a constant battle. But don’t we have adequate provisions in place, I mean in the US and most liberal, democratic modern constitutions, — the separation of powers, elections, term limits and so on? So yes, it’s a problem, but we just have to make sure the rules are followed, don’t we? I’m not making light of it, I just don’t see …

…………………………………………………………………Inadequate Current Power Controls

– I understand, professor. And from what I know about these things, I think that these governance designs are some of humanity’s greatest achievements. But what we are seeing is that they are not enough: they are just provisions for the government segment of societies. And they are being overrun by other forces: technology, the economic power of so-called private business — the trans-national corporations, as well as the national and global finance sector. And who controls the media.

– Huh. Are you talking about election financing? Yeah, I agree that it’s disgusting. But people are beginning to see through that aren’t they? There are some campaigns that are getting huge amounts of money from small individual voter contributions only, and some billionaires who are trying to buy elections with their massive advertising campaigns aren’t doing so well at all?

– It’s much more than that, I’m afraid. Just take the one phenomenon as an example: The lobbyists in the capital, — yeah, they can’t give congressmen and government officials big expensive gifts anymore. But what they are doing is to get close to the representatives’ and senators’ aides, help them write the proposed bills, where they take advantage of the old ‘turkey’ tradition that allows bill sponsors to add funding for special interest ‘turkey’ projects for their constituents in laws that are mainly, and titled, about something else entirely. So you get trillion-dollar bills about the fight to deal with epidemics that have 500 billion worth of provisions that help big corporations, and billions worth of special help for owners of private business airplanes in them — that even people who are against such practices can’t vote down because that would kill or delay the main bill the people desperately need. And the lobbyists promise those aides lucrative jobs in their companies after their term is over.

– Good grief.

– You can say that again… And the leaders and even representatives of such ‘democracies’ who are ‘helping’ other countries kick out their dictators are selling those countries systems that are even more vulnerable to such power abuses — because they want to make sure their own corporate sponsors will make some nice profits in those places. Of course, those forces find it much easier to deal with the new power holders in those countries than to deal with the unruly, ignorant masses, — and using their economic contributions to help the addictive force of power to turn them into … dictators.

…………………………….What Can Be Done About Power? And What Shouldn’t Be?

– So what do you think ought to be done about that?

– The first thing is to get people to think about the problem — to inform them about all the abuse, which is difficult if the media are controlled by forces behind the shenanigans.

– Well, that’s one thing. And some people are getting all worked up about it — agitating, organizing protest rallies, ‘occupying’ this institution or that, getting themselves arrested — I don’t really see that helping. And if you get a little pandemic running, that lets you order people to stay home and arrest them if they get together in groups of more than ten people — all very conveniently justifiable of course — I wouldn’t begin to argue with that — you’ve got things nicely under control.

– So again: what do you think should be done about it? You got to have some solutions to offer people? Some ideas? First steps?

Some First Ideas

– Yes. Well, it so happens that in the proposals for the Public Planning Discourse that we talked about, there are some provisions that could be used to begin to diminish the role of money and power in politics and public planning. But it’s just one part of the issue, no general panacea, and certainly not something that can bring overnight change. Part of the purpose of the Planning Discourse is to support the development and discussion of new ideas, new solutions. Ignite the creativity of all segments of society, not just disrupt, destroy, marginalize, suppress. That idea of ‘disruptive creativity’ is a dangerous one, likely to backfire no matter how brilliant.

– That’s certainly different from all the movement campaigns that are flooding my social media platforms: they all are just asking for money to promote their ideas, to dominate the development, not really encourage or contribute new ideas or insights.

– You’re not alone in that perception, Sophie. But let’s hear more of those ideas, Bog-Hubert.

– Okay, Vodçek. First let me remind you of the two general rules Abbé Boulah keeps repeating, which apply to these ideas too:

No Sudden Overthrow: ‘Gradual Parallel’ Systems Implementation

The first one is: Any new solutions, even if they are designed for ‘global’ unified systems or agreements, should not be introduced ‘overnight’, by force or coercion or surprise, but gradually, on a small scale, ‘parallel’ to the existing systems. The old ‘skunkworks’ idea of R&D companies is a good model for that. Or his ‘innovation zones’ proposal, to introduce new systems first in areas (geographical or other) that have been devastated by natural or man-made disasters, so that they will be perceived as disaster recovery aid rather than as deliberate efforts to displace the old traditions, which would generate unnecessary opposition from folks depending on the relative stability of conditions for their own means of survival.

– I remember, we did discuss those a while ago here, didn’t we?

‘Collateral’ Aspects Of Discourse Improvements:

Merit Points for Discourse Contributions

– Good, so we don’t need to spend much time repeating the details on that: but keep in mind that the proposals should meet that rule as much as possible. The other recommendation is that new system provisions should try to serve many different purposes simultaneously, not just one. And if you remember our discussions about the ‘side-effects’ or ‘collateral benefits’ of the notion of ‘merit points’ for citizen contributions to the planning discourse, you’ll see that they are a good example of that kind of idea, — and they could contribute to new, different power controls.

– Would it be useful if you could give us a brief summary of that, Bog-Hubert?

– I’ll try. It started with the investigation of how pro and con arguments about plan proposals could be evaluated, so that decisions could be more visibly and transparently linked to that merit. So there was the technique of developing a measure of plausibility for such arguments, and for constructing a measure of plausibility support for plan proposals. This seemed necessary to get around the problem that for projects to deal with ‘wicked’ problems affecting people in many different governance entities, decision-making by ‘voting’ is no longer a good tool (if it ever was): How to decide who is entitled to vote, for example? And are voters in different legislative bodies equally seriously affected, even adequately informed about the implications of the plan, etc.?

– Would be nice if that could be made to work, yes. But there’s more, to that collateral fallout, you say?

– Yes, Dexter. To encourage citizens to contribute such arguments, but also other information and ideas, it seemed useful to offer contributors some rewards for doing that: ‘merit points’. But not just for any wild and unsupported claims, and endless repetition of the same stuff: only the ‘first’ entries of the same essential content would ‘count’ towards points. We think that would encourage people to get their contributions in as fast as possible. And the entries would be evaluated by the discourse community for plausibility and importance, significance: rewarding plausible claims with adequate supporting evidence positively and lies, mere speculation and flawed thinking negatively.

– I see. This would become a kind of ‘valuable player’ account for contributing citizens?

– Right. A ‘reputation’ record of meaningful contributions.

– Interesting. What would that be good for?

– Excellent question, Vodçek. For the kind of ‘currency’ in such an account to become meaningful, it must become ‘fungible’, that is, people must be able to use it. A first use would be, I think you’d agree, that this record might become part of a candidate’s election or appointment to public office. It would be an indication not only of a citizen’s interest and willingness to engage in public affairs, but also of their quality of judgment. Reckless, unsupported claims or outright lies would be getting ‘negative’ ratings, so a devious busybody’s account of a lot of nonsense entries wouldn’t be as valuable.

– I see the potential value in that idea, Bog-Hubert. But this part of our discussion started out with your wild claim about power being the big problem, didn’t it? So now you are just saying it helps getting better people into official positions. Those would be positions with power, wouldn’t they? Aren’t the people in such positions just as susceptible to the temptations of power as they are now? So how does this help the power problem?

Merit points and power?

– Ah: now we are getting to the interesting parts. Perhaps we should first see if we can agree on some basics about power. It’s not just a kind of necessary evil, that we can’t do anything about, but a key human desire, perhaps even something like a ‘right’. Part of our basic ‘right’ to the pursuit of happiness? If we acknowledge that people ‘need’ not only basic life necessities like food and shelter, and the relative absence of threats to those, but also the freedom to that pursuit — power, empowerment — to ‘make a difference’ in their lives. We think that’s a fundamental right, don’t we? And that becomes a problem only when it gets in the way of other people’s right to pursue their different forms of life and happiness.

– But don’t we need to have people in some kinds of power positions in any form of organized society?

– Right, Commissioner. ‘Power to the people’ also means the people’s power to appoint people to positions where they make decisions on behalf of the rest of us. So we can feel secure in our smaller different pursuits. I guess you are worried about the conflict between the ideas we just mentioned, about how the people’s assessment of the merit of discussions about plans and policies should determine the decisions, and now the sudden admission that we need people in power to make such decisions for us? As you should be.

– Yes, that’s a good way to put it.

– Okay. Would it help to make some crude distinctions about the kinds of decisions that we need in society? One kind is the orderly ‘running’ of things in society — mostly routine, ‘maintenance’ decisions, carrying out the detailed implementation of policies. And also what to do in case there are unprecedented emergencies, for which there are no policies yet, and no time to wait for the outcome of careful thorough discussions to agree on them. The other kind are the policy issues themselves: for those, we need the discourse and wide popular participation and assessment of the merit of the information people contribute.

– I’m not sure the distinction is always as clear as you make it sound? Somewhat fuzzy?

– I agree, professor. But is it sufficient to see that we need both: the captain of the ship to decide whether to pass the iceberg to starboard or port, the chief engineer to make the engine deliver the needed power to safely steer past it and not get driven into the ice below the surface, the helmsman to carry out the captain’s order?

– I see. So the merit point accounts would help us decide whether we can trust the different kinds of chiefs to make their respective decisions expertly and responsibly. And you are saying that current elections aren’t doing that well enough?

– Or that even initially well-intentioned, competent people can be ‘corrupted’? Yes: by money, or the promises they had to make to entities financing their election campaigns, or by the addictive power of power.

– So how does that merit point system deal with that problem?

– Well, don’t you see? The merit point accounts now contain a new ‘currency’. And that can be used, just like we use money to pay for life’s necessities, as a matter of course, to ‘pay’ for the privilege of making important decisions. The more important, the more you’ll have to ‘pay’. And doing that, as a kind of ‘investment’ in your decisions, you’ll end up using up your credits. We must link the use of the power command buttons to the merit point account: no more credits in your account, no power for the button.

– Well. As the Norwegian Bachelor Farmers in Minnesota would say: That’s certainly ‘different’.

– What do you know about Bachelor Minnesota Farmers, Sophie?

– Just listening to Garrison Keillor in my younger days…But what about necessary decisions that need to be made, even by a captain who’s used up his credit, or who’s called on to make decisions for which his account doesn’t have enough points in it?

Does ‘Accountability’ Require Accounts?

– Well, he may have supporters, won’t he? People who have accounts with some credits in them: can’t they transfer some of their credits to their great leader, to ‘empower’ him or her to make those big decisions? Which now makes them ‘accountable’ too: the fancy talk about ‘accountability’ is really meaningless without there being an account that can be emptied out if you invest your hard-earned reputation in the wrong leaders or the wrong decisions you’ll empower them to make.

– Sounds better than just money, where we don’t know whether it was stolen or ‘hard-earned’. I suppose we should be able to specify what kinds of decisions we are endorsing with our support points?

Implementation on a ‘Skunkworks’ Basis?

– Good idea. Well, the adoption of such a system would certainly be a topic for a wider discourse in a larger platform than our little gang of ‘taverniers de la table ronde’ here. But do you see how it could be started out as a ‘parallel’ system, perhaps as something the polling industry could take on as a ‘skunkworks’ project? Small, local, experimental, to see how it works?

– Careful: You’re cruising for a permanent labeling as a ‘merit point skunk’, my friend…

– By people who don’t have anything better to contribute, that would be a danger we’d have to live with. Until their current systems start exuding even stronger odors.

– Aren’t we there yet? So your doodling over there, that was about how to sneak such a system into the larger society, Bog-Hubert?

– Yeah. Well, it needs some more work, doesn’t show all the system parts. Aren’t there any systems thinking folks around that could help with that, Dexter?

– I can’t say I’m aware of any, off the cuff…

– Yes! Awareness! That’s it! That’s what we need! Right, Vodçek?

– Who said that? I’ll have to consider emergency power decisions…

EVALUATION IN THE PLANNING DISCOURSE: WEIGHTING

Thorbjørn Mann, April 2020

WEIGHTING: ‘WEIGHING THE PROS AND CONS’

Concepts and Rationale


      Much of the discussion and many of the examples in the preceding sections may seem to have taken the assumption of weighting for granted: aspects in formal evaluation procedures, or deontic (ought-) claims in arguments. The entire effort of designing better platforms and procedures for public planning discourse is focused in part on exploring how the common phrase of “carefully weighing the pros and cons” in making decisions about plans could be supported by specific explanations of what it means and, more importantly, how it would be done in detail. Within the perspectives of formal evaluation or assessment of planning arguments (see previous posts on formal evaluation procedures and evaluation of planning arguments), the question of ‘why’ appears not to require much justification: It seems almost self-evident that some of the various pro and con arguments carry more ‘weight’ in influencing the decision than others: Even if there is only one ‘pro’ and one ‘con’, shouldn’t the decision depend on which argument is the more ‘weighty’ one?
      The allegorical figure of Justice carries a balance for weighing the evidence of opposing legal arguments. (Curiously: the blindfolded lady is supposed to make her decision on the heavier weight, not on the social status or power or wealth of the arguing parties, but not even to see the tilt?) Of the many evaluation aspects of formal evaluation procedures, there may be some that really don’t ‘matter’ much to any of the parties affected by the problem or the proposed solution that must be decided upon. Decision-makers making decisions on behalf of others can (should?) be asked to explain their basis of judgment. Wouldn’t their answer be considered incomplete without some mention of which aspects carry more weight than others in their decision?
      While it does not seem that many such questions are asked (perhaps because the questioners are used to not getting very satisfactory answers?), there is no lack of advice for evaluators about how they might express this weighting process. For example, how to assign a meaningful set of weights to different aspects and sub-aspects in an evaluation aspect ‘tree’. But the process is often considered cumbersome enough to tempt participants to skip this added complication of making such assignments, and instead to raise questions of ‘what difference does it make?’, whether it is really necessary, or how meaningful the different techniques for doing this really are.
       Finally, there are significant approaches to design and planning that propose to do entirely without recourse to explicit ‘pro and con’ weighting. Among these are the familiar traditions of voting, decision rules of ‘taking the sense’ of the discussion by a facilitator in the pursuit of consensus or the appearance of consensus or consent, upon more or less organized and thorough discussion, during which the weight or relevance, significance of the different discussion entries is assumed to have been sufficiently well articulated. Another is the method of sequential elimination of solution alternatives (for example by voting ‘out’, not ‘in’) until there is only one alternative left. A fundamentally different method is that of generating the plan or solution from elements (or according to accepted rules) that have been declared valid by authority, theory, or tradition, which are assumed to ‘guarantee’ that the outcome will also be good, valid, beautiful etc.
       Since the issue of evaluation is somewhat confused by being discussed in various different terms: ‘weights of relative importance’; ‘priorities’, ‘relevance’, ‘principles’, ‘preferences’; ‘significance’, ‘urgency’, and there are yet unresolved questions within each of the major approaches, some exploration of the issue seems in order: to revive what looks at this point as a needed, unfinished discussion.


                  Figure 1 — Weighting in planning evaluation: overview

Different ways of dealing with the ‘weighting’ issue

Principle
      A first, simple form of expressing opinions about importance is the use of principles in the considerations about a plan. A principle (understood as not only the ‘first’ and foremost consideration but a kind of ‘sine qua non’ or ‘non-negotiable’ condition) can be used to decide whether or not a proposed plan meets the condition of the principle, and eliminate it from further consideration if it doesn’t. Principles can be lofty philosophical or moral tenets, or simple pragmatic rules such as ‘must meet applicable governmental laws and regulations to get the permit’, regardless of whether a proposed plan might be further refined or modified to meet those regulations, or an exemption be negotiated based on unusual considerations. If there are several alternative proposals to be evaluated, this usually requires several ‘rounds’ of successive elimination identifying ‘admissible’, ‘semi-finalist’ and ‘finalist’ contenders up to the determination of the winning entry, by means of one of the ‘decision criteria’ such as simple majority voting, which here would be not voting ‘in’ for adoption or further consideration, but voting ‘out’.

Weight ‘grouping’
       A more refined approach that considers evaluation aspects of different degrees of importance is that of assigning those aspects to a few groups of importance, such as ‘highly important’, ‘important’, ‘less important’, ‘optional’ and ‘unimportant’, perhaps assigning aspects in these groups ‘weights’ such as ‘4’, ‘3’, ‘2’, ‘1’ and ‘0’, respectively, to be multiplied with a ‘quality’ or ‘degree of performance’ judgment score before being added up. The problem with this approach can be seen by considering the extreme possibility of somebody assigning all aspects the highest category of ‘highly important’, in effect making all aspects ‘equally important’: for n aspects, each one contributes 1/n of the weight to the overall judgment.
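The grouping scheme can be illustrated with a minimal sketch. The five group labels, the aspect names and the quality scores below are hypothetical, chosen only to match the 4-3-2-1-0 weights mentioned above; the sketch is not taken from any of the cited procedures.

```python
# Hypothetical group labels mapped to the 4-3-2-1-0 weights from the text.
GROUP_WEIGHTS = {
    "highly important": 4,
    "important": 3,
    "less important": 2,
    "optional": 1,
    "unimportant": 0,
}

def grouped_score(groups, scores):
    """Add up (group weight x quality score) over all aspects."""
    return sum(GROUP_WEIGHTS[groups[aspect]] * scores[aspect] for aspect in groups)

def effective_shares(groups):
    """Fraction of the total weight that each aspect actually carries."""
    total = sum(GROUP_WEIGHTS[group] for group in groups.values())
    if total == 0:
        raise ValueError("at least one aspect needs a non-zero weight")
    return {aspect: GROUP_WEIGHTS[group] / total for aspect, group in groups.items()}

# Extreme case: an evaluator rates every aspect 'highly important'.
all_high = {
    "cost": "highly important",
    "comfort": "highly important",
    "appearance": "highly important",
    "durability": "highly important",
}
shares = effective_shares(all_high)  # every aspect ends up with share 1/n
```

With four aspects all placed in the top group, each share comes out as one quarter, which is the ‘equally important’ outcome the paragraph warns about.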

Ranking and preference
      The approach of arranging or ‘ranking’ things in the order of preference (on the ordinal scale) can be applied to the set of alternatives to be evaluated as well as to the aspects to be used in the evaluation. Decision-making by preference ranking (e.g. for the election of candidates for public office) has been studied more extensively, e.g. by Arrow [1], finding insurmountable problems for decision-making by different parties, due mainly to ‘paradoxical’ transitivity issues. Simple ranking does not recognize measurable performance (measurable on a ratio or difference scale) where this is applicable, making a coherent ‘quality’ evaluation approach based only on preference ranking impossible.
      An interesting variation of this approach is a method for deciding whether a proposal should be rejected or accepted, attributed to Benjamin Franklin. It consists of listing the pro and con arguments in separate columns on a sheet of paper, then looking for pairs of pros and cons that seem to be equally important, and striking those two arguments out. The process is continued until only one argument, or one pair, is left; if this, or the weightier one of two, is a ‘pro’ argument, the decision will be in favor of the proposal; if it is a ‘con’ argument, the decision should be rejection. It is not clear how this process can be applied to group decision-making without recourse to other methods of dealing with different outcomes by different parties, such as voting.
Interestingly, preference or importance comparison is often suggested as a preliminary step towards developing a more thoroughly considered set of weightings in the next level:

Weights of relative importance
       As indicated above, the technique of assigning ‘weights of relative importance’ to the aspects on each ‘branch’ of evaluation aspect trees has been part of formal evaluation techniques such as the Musso-Rittel procedure [2] for buildings (discussed in previous posts) as well as in proposals for systematic evaluation of pro/con arguments [5]. These weights of relative importance are expressed on a scale of zero to 1 (or zero to 100), subject to the condition that all weights on the respective level must add up to 1 (or 100, respectively). They indicate the evaluator’s judgment about ‘how much’ (by what percentage or fraction) each single aspect judgment should determine the overall judgment. In this view, the use of the ‘principle’ approach above can be seen as simply assigning the full weight of 1.0 or 100% to the one aspect expressed in the discussion that the evaluator considers a principle, overriding all other aspects under consideration.
      To some, the resulting set of weights may seem somewhat arbitrary. The task of having to adjust the weights to meet the condition of adding up to 1 or 100 can be seen as a nudge to get evaluators to more carefully consider these judgments, not just arbitrarily assign meaningless weights: To make one aspect more important (by assigning it a higher weight), that added weight must be ‘taken away’ from other aspects.
        Arbitrariness can also be reduced by using the Ackoff technique [3] of generating a set of weights that can be seen as ‘approximately’ representing a person’s true valuation. It consists of ranking the aspects and assigning numbers (on no particular scale) and then comparing each pair of aspects, deciding which one is more important than the other, and adjusting the numbers accordingly, until a set of numbers is achieved that ‘approximately’ reflects an evaluator’s ‘true’ valuation. To make this set comparable to other participants’ weighting (so that the numbers carry the same ‘meaning’ to all participants), it must then be ‘normalized’ by dividing each number by the total, getting the set back to adding up to 1 (or 100). Displaying these results for discussion will further reduce arbitrariness. This can actually induce participants to change their weightings to reflect recognition of, and empathetic accommodation for, others’ concerns that they had not recognized in their own first assignments. Of course, the discussion requires that the weighting is made explicit.
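The normalization step can be shown in a few lines. This sketch covers only the final division by the total, not the pairwise comparison and adjustment that precede it; the aspect names and raw numbers are invented for illustration.

```python
def normalize(raw_numbers):
    """Divide each raw importance number by the total so the weights add up to 1."""
    if any(value < 0 for value in raw_numbers.values()):
        raise ValueError("importance numbers must not be negative")
    total = sum(raw_numbers.values())
    if total == 0:
        raise ValueError("at least one importance number must be positive")
    return {aspect: value / total for aspect, value in raw_numbers.items()}

# Hypothetical raw numbers on no particular scale, as left after the pairwise adjustments.
raw = {"cost": 30, "comfort": 50, "appearance": 20}
weights = normalize(raw)  # the shares now add up to 1 and keep the same proportions
```

Because every participant’s set is brought to the same total, a weight of 0.5 means ‘half of the overall judgment’ for everybody, whatever scale the raw numbers were first written on.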
      Taking the ‘test’ of deliberation seriously — of enabling a person A to make judgments on behalf of another person B — this can now be seen to require that A could show how A can use not only the set of aspects and the criterion functions but also B’s weight assignments for all aspects and sub-aspects etc., and of course the same aggregation function, resulting in the overall judgment that B would have made. It likely would be different from A’s own judgment using her own set of aspects, criteria, criterion functions and weighting. The technique using weights of relative importance thus looks like the most promising one for meeting this test. By extension, to the extent societal or government regulations are claimed to be representative of the community’s values, what would be required to demonstrate even approximate closeness of the underlying valuation?
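A minimal sketch of this ‘test’, under stated assumptions: the aggregation function is taken to be a simple weighted sum, the partial judgments are given on an assumed scale of -3 to +3, and, as a simplification, A and B share the same partial judgments and differ only in their weights. All names and numbers are hypothetical.

```python
def overall_judgment(weights, partial_judgments):
    """Weighted sum of partial judgments; the weights must add up to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1")
    return sum(weights[aspect] * partial_judgments[aspect] for aspect in weights)

# Assumed partial judgments on a -3..+3 scale, shared by both persons for simplicity.
partial = {"cost": 2.0, "comfort": -1.0}

weights_a = {"cost": 0.8, "comfort": 0.2}  # A's own weighting
weights_b = {"cost": 0.3, "comfort": 0.7}  # B's weighting, as made explicit in the discourse

judgment_a = overall_judgment(weights_a, partial)            # A's own overall judgment
judgment_on_behalf_of_b = overall_judgment(weights_b, partial)  # what B would have concluded
```

In this example A’s own judgment is positive while the judgment A reaches on B’s behalf is slightly negative, so the same plan and the same partial judgments lead to opposite overall assessments once B’s weights are used.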

Approaches avoiding formal evaluation       

The discussion of weighting would be incomplete without mentioning some examples of approaches that entirely sidestep the issue of evaluation of plans of the ‘formal evaluation’ kind and others using weighting. One is the well-known Benefit-Cost Analysis, the other relies on the process of generating a plan following a procedure or theory that has been accepted as valid and guaranteeing the validity or quality of the resulting design or plan or policy.

Expressing weights in money: Benefit-Cost Analysis
      The Benefit-Cost Analysis is based on the fact that the implementation of most plans will cost money — cost of course being the main ‘con’ criterion for some decision-making entities. So the entire question of value and value differences is turned into the ‘objective’ currency of money: are the benefits (the ‘pros’) we expect from the project worth the cost (and other ‘cons’)? This common technique is mandatory for many government projects and policies. It has been so well described as well as criticized in the literature, that it does not need a lengthy treatment here; though some critical questions it shares with other approaches will be discussed below.

Generating plans by following a ‘valid’ theory or custom
       Approaches that can be described as ‘generative’ design or planning processes rely on the assumption that following the steps of a valid theory or using rules and elements that constitute the whole ‘solution’ (elements that have been determined as ‘valid’) to construct the plan will thereby guarantee its overall validity or quality. Thus, there is no need to engage in a complicated evaluation at the end of that process. Christopher Alexander’s ‘Pattern Language’ [4] for architecture and urban design is a main recent example of such approaches, though it can be argued that it is part of a long tradition of similar efforts of rules or pattern books for proper building, going back to antiquity, either as cultural traditions known to the community or as ‘secrets’ of the profession. He claims that following this ‘timeless way’ of building “frees you from all method”.
      However, the argument that the individual patterns and rules for connecting these elements into the overall design somehow ‘guarantee’ the validity and quality of the overall design (if followed properly) merely shifts the issue of evaluation back to the task of identifying valid patterns and relationship rules. This is discussed, if at all, in very different language, and often simply posited by the authority of tradition (‘proven by experience’) or, as in the Pattern Language, of its developer Alexander or followers writing pattern languages for different domains such as computer programming, ‘social transformation’, or composing music. To the best of my knowledge, the evaluation tools used in that process remain to be studied and made explicit. The discussion of this issue is somewhat more difficult than necessary because of Alexander’s claim that the quality of patterns (their beauty, value, ‘aliveness’) is ‘a matter of objective fact’.

Do Weighting methods make a difference?

      A question that is likely to arise in a project whose participants are confronted with the task of evaluating proposed plans, and therefore having to choose the evaluation method they will use, is whether this choice will make a significant difference in the final judgment. The answer is that it definitely will, but the extent of difference will depend on the context and circumstances of each project, especially if there are significant differences of opinion in the affected community. The trouble is that the extent of such differences can only be seen by actually using some of the more detailed techniques for a given project, and comparing the decision outcomes; an effort unlikely to be made in a situation where the very question is whether any such technique is ‘worth the effort’ at all.
      The table below shows a very simple example of such a comparison. For the stated assumptions of a few evaluation aspects and weighting assignments, the different ways of dealing with the weighting issue actually will yield different final plan decisions. This crude example cannot, of course, provide any general guidelines for choosing the tools to use in any specific project. The above list and discussion of policy decision options can at best become part of a ‘toolkit’ from which the participants in each project can choose to construct the approach they consider most suitable for their situation.

       Table 1 Comparison of the effect of different weighting approaches


      The ‘weights of relative importance’ form of dealing with the issue of different degrees of importance in the evaluation considerations is used both in the formal evaluation procedures of the Musso-Rittel type and, in adaptation, in the argument evaluation approach for planning arguments [5]. It may be considered most useful for approximately representing different bases of judgment. However, even for that purpose, there are some questions, for all these forms, that need more exploration and discussion.

Questions and Issues for further discussion

       Apart from the question whether the apparent conflict between evaluation techniques using weighting approaches, and those avoiding evaluation and thus weighting at all, can be settled, there are some issues about weighting itself that require more discussion. They include contingency questions: about the stability of weight assignments over time and different, changing context conditions, their applicability at different phases of the planning process, and the possibilities (opportunities) for manipulation through bias adjustments between weights of aspects and the steepness (severity) of criterion functions for those aspects.

The relationship between weighting and the steepness of criterion functions
       A perhaps minor detail is the relationship between the weight assignments of evaluation aspects, and the criterion functions for that aspect, in a person’s ‘evaluation model’. A steep criterion function curve can have the same effect as a higher weight for the aspect in question. To some extent, making both the weighting and criterion functions of all participants explicit and visible for discussion in a particular project may help to counteract undue use of this effect, e.g. by asking where the criterion function should cross the ‘zero’ judgment (‘so-so, neither good nor bad but anything above that line still acceptable’) and thus prevent extreme severity of judgments. This would assume considerable sophistication on the part of individuals attempting such distortion and of other participants in the discourse to detect and deal with it. But both in personal assessments and in efforts to define common social evaluations (regulations) expressed in terms of criterion functions such as e.g. implied by the suggestions in [8] there remains a potential for manipulation that at the very least should encourage great caution in accepting evaluation results as direct decision criteria.
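The interplay of steepness and weight can be made concrete with a small sketch. The linear, clipped criterion function, the -3 to +3 judgment scale and all numbers are assumptions made for illustration only; actual criterion functions may have quite different shapes.

```python
def criterion_judgment(measure, zero_point, slope, low=-3.0, high=3.0):
    """Assumed linear criterion function: judgment rises with the measure, clipped to the scale."""
    return max(low, min(high, slope * (measure - zero_point)))

def weighted_contribution(weight, measure, zero_point, slope):
    """Contribution of one aspect to the overall judgment."""
    return weight * criterion_judgment(measure, zero_point, slope)

# Same measured performance (11 units, with the 'so-so' zero point at 10 units).
gentle_curve_high_weight = weighted_contribution(0.4, 11.0, 10.0, slope=1.0)
steep_curve_low_weight = weighted_contribution(0.2, 11.0, 10.0, slope=2.0)
```

Both settings give the aspect the same contribution to the overall judgment, so an evaluator who states a modest weight but uses a steep curve influences the result as much as one who openly assigns the higher weight. Making the zero point and the slope visible alongside the weights is what allows other participants to notice this.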

Tentative conclusions and outlook

     These issues suggest that it is far from clear whether they can eventually be settled in favor of one or the other view. What does this mean for the concern triggering this investigation, to explore what provisions should be made for the evaluation task in the design of a public planning platform? Any attempt to pre-empt the decision by mandating one specific approach or technique should be avoided, to prevent it from itself becoming an added controversy distracting from the task of developing a good plan. So given the current state of the discussion, for the time being, should the platform just offer participants information, a ‘toolkit’, about the possible techniques at their disposal? Can ‘manuals’ with guidance for their application, and perhaps suggestions for circumstances in the context or the nature of the problem, offer discourse participants in projects with wide, even global participation adequate guidance for their use? Or will it take more general education to prepare the public for adequately informed and meaningful participation?

     The emerging complexity of the issues discovered about even this minor component of the evaluation question could encourage opponents of these cumbersome procedures. Are calls for stronger leadership (from groups asking for leadership with systems thinking, better ‘awareness’ of ‘holistic’, ecological, social inequality issues, or other moral qualities) actually indicators of public unwillingness to engage in thorough evaluation of the public planning decisions we are facing? Or just inability to do so? Inability caused perhaps by inadequate education for such issues, compounded by inadequate information and lack of accessible and workable platforms for carrying out the needed discussions and judgments? Or is there also some power desire at play, for such groups to themselves become those leaders, empowered to make decisions for the ‘common good’?

Notes, References

[1] Kenneth J. Arrow, 1951, 2nd ed., 1963. Social Choice and Individual Values, Yale University Press.
[2] Musso, A. and Horst Rittel: “Über das Messen der Güte von Gebäuden” in “Arbeitsberichte zur Planungsmethodik”, Krämer, Stuttgart 1971.
[3] Ackoff, Russell: “Scientific Method”, John Wiley & Sons 1962.
[4] Alexander, Christopher: “A Pattern Language“, Oxford University Press, 1977.
[5] Mann, T: ‘The Fog Island Argument’ XLibris, 2009, or “The Structure and Evaluation of Planning Arguments” , INFORMAL LOGIC, Dec. 2010.
[6] Mann, T.: “Programming for Innovation: The Case of the Planning for Santa Maria del Fiore in Florence”. Paper presented at the EDRA (Environmental Design Research Association) Meeting, Black Mountain, 1989. Published in DESIGN METHODS AND THEORIES, Vol 24, No. 3, 1990. Also: Chapter 16 in “Rigatopia — the Tavern Discussions“, Lambert Academic Publication 2015.
[7] Mann, T: “Time Management for Architects and Designers” W. Norton, 2003.
[8] “Die Methodische Bewertung: Ein Instrument des Architekten. Festschrift für Professor Arne Musso zum 65. Geburtstag“, Technische Universität Berlin, 1993; Also: Höfler, Horst: Problem-Darstellung und Problem-Lösung in der Bauplanung. IGMA-Dissertationen 3, Universität Stuttgart 1972.

                                                     –o–

EVALUATION IN THE PLANNING DISCOURSE: CRITERIA AND CRITERION FUNCTIONS

An effort to clarify the role of deliberative evaluation in the planning and policy-making process. Thorbjørn Mann, March 2020

CRITERIA AND CRITERION FUNCTIONS

Concepts and Rationale

One of the key aspects of evaluation and deliberation was discussed earlier (in the section on deliberation) as the task of explaining to one another the basis of our evaluation (quality / goodness) judgments: ‘objectification’. It means showing how a subjective ‘overall’ evaluative judgment about something is related to, or depends on, other ‘partial’ judgments, and ultimately how a judgment is related to some objective feature or ‘criterion’ of the thing evaluated: a measure of performance. Taking this idea seriously, the concept, its sources, and the process of ‘making judgments a function of other judgments’, and especially of criteria, should be examined in some more detail.

There is another reason for that examination: it turns out that criteria and criterion functions may offer a crucial connection between different ‘perspectives’ involved in evaluation in the planning discourse: the view of ‘formal evaluation’ procedures such as the Musso-Rittel procedure [1, 2], the systems modeling domain, and the argumentative model of planning.

The typical systems model is concerned with exploring and connecting the ‘objective’ components of a system and the variables describing the interaction between them, for example in ‘simulation models’ of the system’s behavior over time. The concern is with measures of performance: criteria. The systems model does not easily get involved with evaluation, since this would require tackling the subjective nature of individuals’ evaluation judgments: the model output is presented to decision-makers for their assessment and decision. But systems modeling also often falls victim to the temptation of declaring some ‘optimal’ value of an objective performance variable to be the proper basis for a decision.

The familiar ‘parliamentary’ approach to planning and policy decision-making accepts the presentation of a proposed plan and the exploration of its ‘pros and cons’ (arguments) as the proper basis for decisions (but then reverts to voting as the decision-making tool, which potentially permits disregarding all concerns of the voting minority, a different problem). The typical arguments in such discussions or debates rarely get beyond invoking ‘qualitative’ advantages and disadvantages (evaluation ‘aspects’ in the vocabulary of formal evaluation procedures), and refer to quantitative effects or consequences (‘criteria’) only in a rhetorical and not very systematic manner. The typical ‘planning argument’ assumption of the conditions under which its main instrumental premise will hold (ref. the section__ on argument evaluation) is usually not even made explicit but taken for granted, even though it would actually call for a thorough description of the entire system into which the proposed plan will intervene, complete with all its quantitative data and expected tracks into the future.

These considerations suggest that the concepts of criteria and criterion functions can be seen as the (often missing) link between the systems modeling view, the argumentative discourse, and the formal evaluation approach to planning decision-making.

Criteria types

Another understanding of ‘criterion’ pertains to the assessment of that explanation: the level of confidence of a claim, the level of plausibility of arguments, and the degree of importance of an aspect. Since the assessment of plans involves expected future states of affairs that cannot be observed and measured as matters of fact in reality (not being ‘real’ yet, just estimated, predicted), those estimates even of ‘objective’ features must be considered ‘subjective’ no matter how well supported by calculations, systems simulation, and consistency of past experience with similar cases. The degree of certainty or plausibility of such relationships may be considered by some as an ‘objective fact’ feature of the matter, but the decisions we make and refer to in our explanation of our judgments to each other are subjective estimates of that degree, and that will be the result of the discussion, debate, deliberation of the matter at hand. These criteria may be called ‘judgment assessment criteria’.

‘Solution performance’ criteria

These criteria are well known, and have been grouped and classified in various ways according to evaluation aspect categories, in architecture starting with Vitruvius’ triad of aspects ‘firmness, utility (commodity) and delight’ (beauty). Interestingly enough, the explanation of beauty held more attention in terms of exploring measurable criteria such as proportion ratios than the firmness and utility aspects. In the more recent ‘benefit/cost’ approach that concern somehow disappeared or has been swallowed up in one of the ‘benefit/cost’ categories that measure both kinds with the criterion of money, which arguably is more difficult to connect with beauty in a convincing manner. Meanwhile, engineering has made considerably more progress in actually calculating structural stability, bearing loads of beams and trusses, resistance of buildings to wind loads, thermal performance of materials etc.

For all the hype about functional aspects, the development of adequate criteria has been less convincing: the size of spaces for various human activities, or walking distances between different rooms in places like hospitals or airports, are admittedly easy to measure but all seem to be missing something important. That sense of something missing may have been a major impulse for the effort to get at ‘Quality’ in Christopher Alexander’s efforts to develop a ‘Pattern Language’ for architecture and environmental design [3]. ‘Universal design’ looks at functional use concerns of spaces by people with various disabilities but has paid more attention to suggesting or prescribing actual design solutions than to developing evaluation criteria. My explorations of a different approach try to assess the value of buildings by looking at the adequacy of places in the built environment for the human occasions they are accommodating, as well as the image the design of the place is conveying to occupants, and developing criteria such as ‘functional occasion adequacy’, ‘occasion opportunity density’ and ‘image adequacy’ [4]. Current concerns about ‘sustainability’ or ‘regenerative’ environmental design and planning seem to claim more, even dominant attention than earlier aspects; the development of viable evaluation criteria has not yet caught up with the sense of crisis. (An example is the attention devoted to the generation and emission of CO2 into the atmosphere: it seems to play a crucial role in global climate change, but the laudable proclamations by governments or industries of plans to ‘achieve a level of x emission of CO2 within y years’ seem somewhat desperate (just doing something?) while not addressing either the real effects of the climate change itself, or the question of ‘what about the time after and up until date y’ (paying ‘carbon offsets’ or taxes?).)

Measurement scales for criteria

In exploring more adequate criteria to guide planning decisions, it is necessary to look at how criteria are measured: both to achieve a better ‘objective’ basis for comparing alternative plans and to avoid neglecting important aspects just because they are and remain difficult to measure in acceptable objective ways, and will have to rely on subjective assessments by affected parties.

The ‘qualitative’ assessment of evaluation aspects will use judgments on the nominal and ordinal scale, both for the ‘goodness’ judgments and the ‘criteria’. The fact that these are mostly subjective assessments does not relieve us from the need to explain to each other how they are distinguished, what we mean by certain judgments, and how they are related: that is, how ‘criteria’ judgments explain ‘goodness’ judgments. This is the question of criterion functions, which usually focus on explaining how our subjective ‘goodness’ judgments relate to (depend on) ‘objectively measurable performance criteria’.

Criterion functions

Types of criterion functions

The concept of ‘criterion function’ was defined as the demonstration of the relationship between subjective quality judgments and (usually) objective features or performance measures of the thing evaluated,  in the form of verbal explanation,  equations, or diagrams. 

A first kind of distinction between different kinds of such explanations can be drawn according to the scales of measurements used for both the quality judgments and the criteria. The following table shows some basic types based on the measurement scales used: 

Table 1 — Criterion function types based on judgment scales

For simplicity, the types for the difference and ratio scale are listed together as ‘quantitative’ kinds. Further distinctions may arise from consideration whether the scales in question are ‘bounded’  by some distinct value on one or both ends, or ‘unbounded‘ — towards +∞ or -∞.  

Another set of types are related to the attitudes of where the ‘best‘ and ‘worst‘ features are located:   The attitudes will call for different shapes of diagrams: 

“The more, the better”; “The less, the better”; “The value x on the criterion scale is best; smaller or larger values are worse”; “The value x on the criterion scale is worst; lower or higher values are better”.

 

The attitude ‘the more, the better’ will have the ‘couldn’t be better’ score at infinity, while it will be at zero (or even -∞ ?) for the opposite ‘the less, the better’; or the best or worst scores may be located at some specific value x of the performance criterion scale.

Criterion function examples

Table 8.2  Criterion functions type 1 and 2

Table 8.3  Criterion functions  type 3,4,5,6 

A common type 6 example, with a bounded judgment scale and a performance scale bounded at zero at the low end and unbounded towards +∞ at the other end, is the following: asked to explain the basis of our subjective ‘goodness / badness’ (or similar) judgment about a proposed plan, we can respond by drawing a diagram showing the objective performance measurement scale with its units as a horizontal line, and the judgment scale on the vertical axis. An example is judging the ‘affordability’ of proposed projects on a chosen judgment scale of -3 to +3, with +3 meaning ‘couldn’t be more affordable’, -3 meaning ‘couldn’t be more unaffordable’, and the midpoint of zero meaning ‘can’t decide, don’t know, cannot make a judgment’.

Figure 2  A type 6 criterion function of ‘affordability’ judgments related to the cost of a plan

In the following, the discussion will be focused mainly on functions of type 6 — judgments expressed on a +U to -U  scale (e.g. +3 to -3) with a midpoint of zero for ‘don’t know, can’t decide;  neither good nor bad’, and some quantitative scale for the performance criterion. 

The criterion function lines can take different shapes, depending on the aspect. For some, like the cost aspect in the first example above, the rule ‘the more, the worse’ will call for a line declining towards +∞ on the right; many aspects call for a ‘the more, the better’ line rising from zero (or -∞?) towards +∞ on the opposite end; for others there may be a ‘best’ or ‘worst’ value at some point in between. Some people may wish to have a building front in what is widely considered the ‘most beautiful’ proportion, the famous ratio 1:1.618…

Figure 8.3  — Four different ‘attitude’ curves of type 6 criterion functions
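The four ‘attitude’ shapes can be illustrated with a minimal Python sketch. The formulas below are not Rittel’s equations (which are not reproduced in this text); they are simple stand-ins chosen for illustration, assuming a -3 to +3 judgment scale and a performance scale bounded at zero. The function names and the parameters `neutral` and `tolerance` are hypothetical.

```python
import math

U = 3.0  # assumed judgment scale: -U to +U (here -3 to +3)

def more_is_better(x, neutral):
    """Rises from -U at x = 0 toward +U as x grows; crosses 0 at x = neutral."""
    return U * (x - neutral) / (x + neutral)

def less_is_better(x, neutral):
    """Falls from +U at x = 0 toward -U as x grows; crosses 0 at x = neutral."""
    return -more_is_better(x, neutral)

def best_at(x, best, tolerance):
    """Peaks at +U when x == best and crosses 0 at a distance of `tolerance`."""
    d = (x - best) / tolerance
    return U * (2.0 * math.exp(-math.log(2.0) * d * d) - 1.0)

def worst_at(x, worst, tolerance):
    """Mirror image of best_at: -U at x == worst, rising toward +U away from it."""
    return -best_at(x, worst, tolerance)

# Hypothetical use: a facade proportion judged best at the ratio 1.618
print(round(best_at(1.5, 1.618, 0.2), 2))
```

Each function reaches +3 or -3 only where the performance really could not be better or worse, which matches the expectation discussed for these curves.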

Expectations for Criterion functions; Questions

There are some aspects of rationality attached to the criterion function concept. The line expresses the judgment of a cost-conscious client, and of course getting the project ‘for free’ would deserve the score of +3 ‘couldn’t be better / more affordable’. The line would approach the bottom judgment of -3 towards infinity: for any cost however large, it could be even worse. So +3 and -3 judgment scores should be assigned only if the performance  r e a l l y  couldn’t be better or worse, respectively. Furthermore, we would expect the line to be smooth, in this case smoothly descending: it should not have sudden spikes or valleys. If the cost of a Plan A could be reduced somewhat, the resulting score for the revised solution A’ should not be lower than the score for the original version of A. But should that prohibit superstitious evaluators from showing such dips in their criterion function lines, e.g. for superstitiously ‘evil’ numbers like 13? There are many building designs that avoid heights resulting in floor levels with that number or, if the buildings are higher, just don’t show those floors on the elevator buttons.

Where should a person’s judgment line ‘reasonably’ cross the 0 axis? It might be at the amount of money the client has set aside for the project budget: as long as the score is on the ‘+’ side, the plan is ‘affordable’, and the lower the cost, the better; the higher the cost, the worse and less affordable. This shows that the judgment line, the ‘criterion function’, will be different for different people: it is subjective, because it depends on the client’s budget (or credit limit), but ‘objectified’ (explained) by showing how the affordability judgment score relates to the actual expected cost. For a wealthier client, the line would shift toward the right; a less affluent client would draw it more steeply to the left. (Of course even this simple and plausible criterion might raise discussion: does ‘cost’ mean ‘initial construction cost’ or ‘client equity’, or some time-related cost such as ‘average annual cost including mortgage payments etc.’ or ‘present value of all the costs, initial plus annual costs (each discounted back to present worth), for a specified planning horizon’?)
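A small Python sketch can make the budget-dependent zero crossing and the ‘cheaper is never worse’ expectation concrete. The formula is an assumption chosen for illustration (one of many smooth curves that start at +3 for zero cost, cross zero at the budget, and approach -3 for very high cost); the budgets and costs are hypothetical.

```python
def affordability(cost, budget, top=3.0):
    """Score +top at zero cost, 0 at the budget, approaching -top for very high cost."""
    if cost < 0 or budget <= 0:
        raise ValueError("cost must be >= 0 and budget must be > 0")
    return top * (budget - cost) / (budget + cost)

def is_non_increasing(func, costs):
    """Check the expectation that a cheaper plan never scores lower."""
    scores = [func(c) for c in sorted(costs)]
    return all(a >= b for a, b in zip(scores, scores[1:]))

# Two hypothetical clients judging the same plan cost of 300,000
print(affordability(300_000, budget=200_000))  # negative: over budget
print(affordability(300_000, budget=500_000))  # positive: within budget
```

A superstitious evaluator’s curve with a dip at some ‘evil’ number would fail the `is_non_increasing` check, which is one way such dips could be brought up for discussion.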

There can be more complex criterion functions, with two or more variables defining judgment categories. Figure 8.4 shows a function diagram for clothing sizes (pants), a variation of the simple version in the above example. The sizes are roughly defined by ranges of pants leg lengths and waistlines (roughly, because different manufacturers’ styles and cuts will result in some overlap in both dimensions). The judgment scale is the ‘comfort’ or ‘fit’ experienced by a customer, expressed e.g. on the +3 to -3 scale. The ‘best’ combination would of course be the ‘bespoke’ solution that can only be achieved by the tailor creating the garment for the specific measurements of each individual customer. The ‘fitness’ judgments (in the third dimension) will form a smooth mountain with its +3 top located above the specific measurement, with widening altitude lines (isohypses) for less perfect fits. The ‘so-so’ or ‘just acceptable’ range would cover an area within one of the ‘size’ regions, or actually overlap their borders. The area would likely be a kind of ellipse with a narrower range for leg length and allowing for a greater variation of waistline (accommodated by a series of holes in the belt, for before-and-after-dinner adjustment…). This example also demonstrates nicely that the evaluation judgment ‘fit’ is a personal, ‘subjective’ one even when it involves an ‘objectively’ measurable variable.

Figure 8.4  —  A ‘3D’ criterion function for a ‘feature’ domain defined by two variables.
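The ‘mountain’ over two variables can be sketched as follows. This is only an illustration under assumed numbers: the bell shape, the centimetre units and the tolerance values (wider for waist than for leg length) are hypothetical choices, not taken from the figure.

```python
import math

def fit_score(waist, leg, own_waist, own_leg, waist_tol=4.0, leg_tol=2.0, top=3.0):
    """+top at the customer's own measurements, 0 on an ellipse with
    semi-axes waist_tol and leg_tol, approaching -top far away from it."""
    d2 = ((waist - own_waist) / waist_tol) ** 2 + ((leg - own_leg) / leg_tol) ** 2
    return top * (2.0 * math.exp(-math.log(2.0) * d2) - 1.0)

# Hypothetical customer: waist 86 cm, leg length 81 cm
print(round(fit_score(88, 81, 86, 81), 2))  # waist 2 cm off: still a good fit
print(round(fit_score(86, 83, 86, 81), 2))  # leg 2 cm off: just on the 'so-so' line
```

The zero-score contour is the ellipse described above: the same 2 cm deviation is judged more severely for leg length than for waistline.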

In his Berkeley lectures, Rittel proposed mathematical equations for the four basic function shapes, to calculate the judgment scores for different performance values. This may be useful for evaluation tools that have been agreed upon as ‘standards’, such as government regulations. However, for lay participants expressing their individual assessments, having to specify the different parameters to generate the specific curves would be unrealistic. And should the lines not be drawn as a fuzzy broad band expressing the approximate nature of these judgments, instead of a crisp fine line?

Figure 8.5  Should the criterion function be drawn as ‘fuzzy’ lines?

 So it will be more practical to simply ask participants to draw their personal line by hand, with  a fat pencil or brush, after indicating the main preferences about the location of ‘best‘ and ‘worst’ performance, and where the lines should cross the center ‘zero’ judgment axis. 
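If a platform had to store such a hand-drawn ‘fat pencil’ line, one possible representation is a handful of points read off the drawing plus a band width. The sketch below is an assumption about how this might be done, not a described feature of any existing tool; `drawn_curve`, `half_width` and the sample points are hypothetical.

```python
from bisect import bisect_right

def drawn_curve(points, half_width=0.5, low=-3.0, high=3.0):
    """Return a function x -> (lower, centre, upper) judgment band.

    `points` are (performance, score) pairs read off a hand-drawn line;
    values between them are linearly interpolated.
    """
    pts = sorted(points)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]

    def band(x):
        if x <= xs[0]:
            centre = ys[0]
        elif x >= xs[-1]:
            centre = ys[-1]
        else:
            i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            centre = ys[i - 1] + t * (ys[i] - ys[i - 1])
        clamp = lambda v: max(low, min(high, v))
        return clamp(centre - half_width), centre, clamp(centre + half_width)

    return band

# Hypothetical line: best at 0, zero judgment at 100, worst from 300 onward
band = drawn_curve([(0, 3), (100, 0), (300, -3)])
print(band(50))
```

The band is clipped to the judgment scale, so the ‘fuzziness’ never claims more than ‘couldn’t be better’ or less than ‘couldn’t be worse’.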

Equations expressing evaluation judgments would undoubtedly be desirable if not necessary for AI tools that might aim to use calculations with successive approximation to find ‘optimal’ solutions. But whose evaluations should those be? The discussion of evaluation so far has shown that the judgment parts of objectification are subjective; getting ‘universal’ or societally accepted ‘norms’ would require agreed-upon (or imposed, by evaluation or systems consultants) aggregated ‘curves’ (see the sections on aggregation and decision criteria). Some authors [5, 6] simplify or circumvent this issue by proposing simple straight lines, for which equations will be easy to establish, such as in one of the examples below, suggesting that these should be used as common policy tools like government regulations. This needs more discussion. The example shows a criterion function for the predicted total cost for a specific project, with the function crossing the ‘zero’ line at some ‘neutral’ or ‘acceptable’ value of cost for a building of the given size. The question arises why a solution achieving a lower cost than that of 105,000 (where the line breaks from +5 into the sloped line) should not get a better judgment; but it would be more cumbersome to establish the equation for a curve that gradually approaches the top judgment at zero cost on the left and the bottom judgment at infinitely high cost on the right, and also crosses the zero judgment line at the selected neutral value of 210,000. The equation shown in the second line Y.1 of the example is easy to generate and use, but arguably somewhat arbitrary.

Figure 8.5  Simplifying the judgment curves with straight lines  [5]
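To see how little is needed for such a straight-line function, here is a minimal sketch built only from the two values mentioned in the text: the break point at 105,000 (score +5) and the neutral value at 210,000 (score 0). It is not the equation Y.1 from the figure, which is not reproduced here; the symmetric lower limit of -5 is an assumption of this sketch.

```python
def straight_line_score(cost, neutral=210_000, best=105_000, top=5.0):
    """Piecewise-linear judgment: +top for any cost up to `best`, 0 at `neutral`,
    and (an assumption of this sketch) clamped at -top for very high cost."""
    raw = top * (neutral - cost) / (neutral - best)
    return max(-top, min(top, raw))

for cost in (50_000, 105_000, 157_500, 210_000, 400_000):
    print(cost, straight_line_score(cost))
```

The printout makes the objection visible: a plan costing 50,000 gets exactly the same score as one costing 105,000.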

Criteria for discourse contributions; judgments, argument assessment. 

Could the criterion functions be modified according to the plausibility (confidence) assessment for the evaluation of argument plausibility (whose deontic ought-premise is identical or conceptually linked to the ‘goodness’ aspect of formal evaluation of the Musso-Rittel type)? The corresponding criterion function lines would be the more ‘flattened’ towards the ‘zero’ line (honestly representing ‘don’t know’) the closer the plausibility judgment for the deontic premise approaches that zero value (on the assumed -1 to +1 plausibility scale). This would still express the person’s preferences on the criterion, but adjust the impact of that aspect according to the level of confidence in the solution pursuing and achieving that goal.

Figure  8.6 —  Plausibility – modified quality criterion functions.
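One simple way to produce such ‘flattening’ is to multiply each quality score by the plausibility judgment. This is an assumed rule for illustration only; the text leaves the exact modification open. The sketch handles only plausibility values from 0 to 1, because how negative plausibility should affect a quality curve is not settled here.

```python
def plausibility_modified(score, plausibility):
    """Flatten a quality score toward 0 as plausibility drops from 1 to 0."""
    if not 0.0 <= plausibility <= 1.0:
        raise ValueError("this sketch only handles plausibility in [0, 1]")
    return plausibility * score

# A hypothetical criterion function sampled at five performance values
original = [3.0, 1.5, 0.0, -1.5, -3.0]
print([plausibility_modified(s, 0.5) for s in original])
```

The zero crossing stays where the person put it; only the steepness, and thus the impact of the aspect, changes.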

There are also questions about the way these functions might be manipulated to generate ‘bias’. For example: A participant who has assigned the aspect in question a low weight of relative importance might be tempted to draw the criterion function line steeper or less steep to increase that aspect’s impact on the overall assessment. 

The extent to which participants will be led to consider evaluation aspects or arguments contributed by other participants and make them part of their own evaluation will depend on the degree of confidence, plausibility, credibility with which these entries are offered: how well is a claim ‘supported’ by evidence or further arguments? This aspect is sometimes discussed in general terms of ‘breadth’ and ‘depth’ of support, or in more ‘scientific’ terms by the amount of ‘data’, the rigor of collecting the data and analyzing its logic of inference and statistical ‘significance’. It should be obvious that simple measures such as ‘counts’ of claims of breadth (the number of different claims made in support of a judgment) and depth (the number of claims supporting claims and their support, aspects, sub-aspects, arguments and support of premises and evidence for each premise etc.) are meaningless if those claims lack credibility and plausibility, or are entirely ‘made up’.

Support of claims or judgments can be ‘subjective’ or ‘objective’. But the general attitude is that ‘objective’ claims well supported by ‘facts’ and ‘scientific validity‘ carry a greater strength  of obligation for others to accept in their ‘due consideration‘ than ‘subjective’ claims without further supporting evidence or argument that each person may have the right to believe but cannot expect everybody else to accept as theirs. A possible rule may be to introduce the concern for others as a standard general aspect in the overall evaluation aspect list, but keep the impacts of  a criterion on one’s own part and the impact on others on separate aspects and criterion functions. The weight or impact (i.e. how much are our judgments influenced  by somebody’s argument or claim) we then accord that aspect in our own judgment will very much depend on the resulting degree of plausibility. All this will be a recurring issue for discussion.

The ‘criterion function’  for this second kind of criterion:  plausibility,  will take a slightly different  form (and mathematical expression, if any) than those pertaining to the object goodness assessment. For example, the plausibility of a pro or con argument  can be expressed as the product (multiplication) of the plausibility judgments pl of all its (usually two or three)  premises:  

Argpl(i) = pl(FI-premise) * pl(D-premise) * pl(F-premise) * pl(Inference rule)

of the standard planning argument :  

D(PLAN A)  (Plan A ought to be adopted) because 

FI(A–> Outcome B given conditions C) (Given C, A will produce B) , and

D(B) (B ought to be achieved), and

F(C) (Conditions C are / will be present).

The ‘criterion function’ for this assessment, for only the two main premises FI(A –> B) and D(B), takes the form of a 3D surface in the ‘plausibility cube’:

Figure  8.7 — Argument plausibility as a function of  premise plausibility (two premises)

Here, D(x) denotes the Plan proposal, D(y) is the deontic claim of desired outcome, and F(xRELy)  is the factual-instrumental claim that Plan x will produce outcome y.  [7]
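The product rule for argument plausibility is easy to state in code. The sketch below simply multiplies whatever premise plausibility judgments are supplied, each on the -1 to +1 scale; the function name and the sample values are hypothetical.

```python
from math import prod

def argument_plausibility(premise_plausibilities):
    """Multiply the plausibility judgments (each in [-1, +1]) of all premises."""
    values = list(premise_plausibilities)
    if any(not -1.0 <= p <= 1.0 for p in values):
        raise ValueError("plausibility judgments must lie in [-1, +1]")
    return prod(values)

# Hypothetical judgments for the FI-premise, the D-premise and the F-premise
print(argument_plausibility([0.9, 0.8, 0.5]))
```

Because the judgments are multiplied, a single premise judged ‘don’t know’ (zero) reduces the plausibility of the whole argument to zero, however confident the participant is about the others.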

Evaluation and time: changing future performance levels

All assessments of ‘goodness’ (quality) or plausibility of plans are expectations for the future. So judgments about a plan’s effectiveness are, explicitly or implicitly, based on some assumption about the time in the future at which the expected performance will be realized. For some kinds of projects, it will be meaningful to talk about plan effects immediately on implementation: ‘fixing’ a problem for good when executed. The Musso/Rittel and similar criterion functions are based on that assumption. However, many if not most public plans will reach full effectiveness only after some initial ‘shake-down’ period (during which the problem may actually be expected to first get worse before getting better) and to different degrees over time. For most plans, a ‘planning horizon’ or life span is assumed; expected benefits will vary over time, and eventually decline and stop entirely. The only specific assessment criteria for this are the computations of economic aspects: initial versus recurring costs and benefits, and their conversions into ‘present value’ or ‘annual’ or ‘future value’ equivalents, based on personally different discount rates for the conversion, and estimates of ‘planning horizons’.
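The present value conversion mentioned here follows the standard discounting formula, PV = sum of flow(t) / (1 + r)^t over the planning horizon. The following sketch shows how two people with different discount rates can reach opposite verdicts on the same plan; the cash flows and rates are hypothetical.

```python
def present_value(net_flows, rate):
    """Discount yearly net benefits (benefit minus cost) back to year 0.

    net_flows[0] is the initial year (not discounted); net_flows[t] is year t.
    """
    return sum(flow / (1.0 + rate) ** t for t, flow in enumerate(net_flows))

# Hypothetical plan: initial cost of 1000, then four years of net benefit 300
flows = [-1000, 300, 300, 300, 300]
print(round(present_value(flows, 0.03), 1))  # low discount rate: worthwhile
print(round(present_value(flows, 0.12), 1))  # high discount rate: not worthwhile
```

The length of the list plays the role of the planning horizon, so both assumptions that differ between persons are explicit in the call.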

As soon as this is taken into consideration, it becomes obvious that differences of opinion may be based on different assumptions about timing, and the need arises for making these assumptions more explicit. This means that the expected ‘performance track’ of different plan solutions over time should be established and made visible in the evolving criterion functions. This aspect (to my knowledge) has not been adequately explored and integrated into evaluation practice.

 Figure 8.8.  Evaluation of alternative plans over time

The diagram is a first attempt at displaying this. It could be seen as the task of comparing two different plans for dealing with the issue of human CO2 emissions with the expected ‘do nothing’ alternative. One plan (A) will show continuing emission levels for some period before reversing the direction of the trend; the other (B) is assumed to take effect immediately but not as strongly as plan A. This suggests that the better basis of comparison would be the ‘areas’ of ‘improvement’ or ‘worsening’ in the judgment surface over time (in the diagram shaded for plan A).
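The ‘area’ comparison can be approximated numerically once the judgment tracks are sampled at a few points in time. The sketch uses the trapezoidal rule; all tracks and years below are invented for illustration and do not come from the diagram.

```python
def area_between(times, plan_scores, baseline_scores):
    """Trapezoidal area of (plan - baseline) judgment scores over time."""
    diffs = [p - b for p, b in zip(plan_scores, baseline_scores)]
    return sum(
        0.5 * (diffs[i] + diffs[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )

# Hypothetical judgment tracks on the -3 to +3 scale
years = [0, 5, 10, 20]
do_nothing = [0, -1, -2, -3]
plan_a = [0, -1.5, -1, 2]    # gets worse first, then improves strongly
plan_b = [0, -0.5, -0.5, 0]  # takes effect at once, but more weakly

print(area_between(years, plan_a, do_nothing))
print(area_between(years, plan_b, do_nothing))
```

With these made-up numbers plan A comes out slightly ahead over twenty years, but a shorter planning horizon would reverse the ranking, which is exactly why the horizon assumption needs to be made explicit.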

Another question arising in this connection, besides the suggestion above that the expected trends should be drawn as ‘fuzzy’, broad tracks rather than the crisp lines printed out by the computer simulations, is that any plausibility (probability) estimates for the predictions involved are also likely to decline over time, from initial optimistic certainty down, and more honestly, towards the zero middle line: “not sure”, “don’t know”.

Preliminary conclusions

For discussion, including criterion functions in the deliberation process offers some interesting improvement possibilities compared to conventional practice:

  •   A more detailed, specific description of the basis of judgment of participants in the discourse;
  •   The ability to develop overall  group or ‘community’ measures of collective merit of proposed plans, with specific indication of the plan details about which participants disagree, and thus opportunities for finding plan modifications leading to improvements of assessment and acceptance.  For example, while it is possible to construct functions based on preference rankings of solutions, (which do not show the spread of the ranking scores, nor the overall location of ranking clusters on a performance measure scale), the comparison of criterion function curves can facilitate the identification of ‘overlap’ regions of acceptable solutions.
  •   It should be obvious that overall ‘group’ assessment indicators must be based on some aggregation of individual (or partial group) judgment scores; these can then be used for varieties of Pareto-type analysis and decision criteria. Instead of just using e.g. ‘averaged’ group scores, or scores ‘weighted’ by the number of members of the different subgroups or parties, decision criteria based on such aspects as degrees of improvement offered to different subgroups by the different plan versions can be developed. (See the section on decision criteria, which should be clearly distinguished from the evaluation criteria discussed here, used for individual assessment.)
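The identification of ‘overlap’ regions mentioned in the list above can be sketched as a simple filter over candidate performance values. The two participants, their criterion functions and the candidate values are hypothetical; any callable that returns a judgment score would do.

```python
def acceptable_overlap(criterion_functions, candidates, threshold=0.0):
    """Return the candidate performance values that every participant
    scores at or above `threshold` on their own criterion function."""
    return [
        x for x in candidates
        if all(f(x) >= threshold for f in criterion_functions)
    ]

# Two hypothetical participants judging a price (in thousands)
client = lambda price: 3.0 * (200 - price) / (200 + price)   # the less, the better
builder = lambda price: 3.0 * (price - 150) / (price + 150)  # the more, the better

print(acceptable_overlap([client, builder], range(100, 301, 25)))
```

An empty result would signal that no candidate is acceptable to all, which points to the plan details where modifications or further discussion are needed.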

The questions arising from this tentative discussion suggest that this part of the evaluation component of planning discourse, and especially public planning discourse with wide public participation by affected parties spread over different administrative constituencies, needs more research and discussion.

–o–

References

[1] Musso, Arne and Horst Rittel: “Über das Messen der Güte von Gebäuden” in “Arbeitsberichte zur Planungsmethodik 1”, Stuttgart 1969. In English: “Measuring the Performance of Buildings”, Report about a Pilot Study, Washington University, St. Louis, MO 1967.

[2] Dehlinger, Hans: “Deontische Fragen, Urteilsbildung, Bewertungssysteme” in “Die Methodische Bewertung: Ein Instrument des Architekten”: Festschrift zu Prof. Musso’s 65. Geburtstag. Technische Universität Berlin 1993.

[3] Alexander, C. et al.: “A Pattern Language”, Oxford University Press, New York 1977.

[4] Mann, T.: “Built Environment Value as a Function of Occasion and Image”, Academia.edu. Also: “Rigatopia”, LAP, Lambert Academic Publishing, Saarbrücken 2015.

[5] Musso, Arne: “Planungsmodelle in der Architektur”, Technische Universität Berlin, Fachgebiet Planungsmethoden, Berlin 1981.

[6] Höfler, Horst: “Problem-darstellung und Problem-lösung in der Bauplanung”, IGMA-Dissertationen 3, Universität Stuttgart 1972.

[7] Mann, Thorbjoern: “The Fog Island Argument”, Xlibris, 2009. Also: “The Structure and Evaluation of Planning Arguments” in INFORMAL LOGIC, Dec. 2010.

EVALUATION IN THE PLANNING DISCOURSE — AI SUPPORT OF EVALUATION IN PLANNING

Part of a series of  issues to clarify the role of deliberative evaluation in the planning and policy-making process. Thorbjørn Mann, February 2020.

The necessity of information technology assistance

A planning discourse support platform aiming at accommodating projects that cannot be handled by small F2F ‘teams’ or deliberation bodies, must use current (or yet-to-be developed) advanced information technology, if only just to handle communication. The examination of evaluation tasks in such large project discourse, so far, also has shown that serious, thorough deliberation and evaluation can become so complex that information technology assistance for many tasks will seem unavoidable, whether in form of simple data management or more sophisticated ‘artificial intelligence‘.

So the question arises what role advanced Artificial or Augmented Intelligence tools might play in such a platform. A first cursory examination will begin by surveying the simpler data management (‘house-keeping’) aspects that have no direct bearing on actual ‘intelligence’ or ‘reasoning’ and evaluation in planning thinking, and then exploring possible expansion of the material being assembled and sorted, into the intelligence assistance realm. It will be important to remain alert to the concern of where the line between assistance to human reasoning and substituting machine calculation results for human judgment should be drawn.

‘House-keeping’ tasks

a. File maintenance. A first ‘simple’ data management task will of course be to gather and store the contributions to the discourse, for record-keeping, retrieval and reference. This will apply to all entries, in their ‘verbatim‘ form, most of which will be in conversational language. They may be stored in simple chronological order as they are entered, with date and author information. A separate file will keep track of authors and cross-reference them with entries and other actions. A log of activities may also be needed.

b. ‘Ordered’, or ‘formatted’ files. For a meaningfully orchestrated evaluation in the discourse, it will be necessary to check for and eliminate duplication of essentially the same information, to sort the entries, for example according to issues, proposals, arguments, factual information (perhaps already in some formatted manner), and to keep the resulting files updated. This may already involve some formatting of the content of ‘verbatim’ entries.

c. Preparation of displays, for overview. This will involve displays of ‘candidates’ for decision, the resulting agenda of accepted candidates, ‘issue maps’ of the evolving discussion, and evaluation and decision results and statistics.

d. Preparation of evaluation worksheets.

e. Tabulating, aggregating evaluation results for statistics and displays.
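
As an illustration only, the file maintenance described in item a above could be sketched in Python along the following lines. The names Entry and DiscourseLog, the in-memory lists, and the wording of the activity log are assumptions made for this sketch; they do not describe any existing platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Entry:
    entry_id: int        # position in the chronological 'verbatim' file
    author: str
    text: str            # the contribution, kept verbatim
    timestamp: datetime


@dataclass
class DiscourseLog:
    entries: list = field(default_factory=list)     # chronological 'verbatim' file
    by_author: dict = field(default_factory=dict)   # author -> list of entry ids
    activity: list = field(default_factory=list)    # simple log of activities

    def add(self, author, text, timestamp=None):
        """Store one contribution and cross-reference it with its author."""
        timestamp = timestamp or datetime.now(timezone.utc)
        entry = Entry(len(self.entries) + 1, author, text, timestamp)
        self.entries.append(entry)
        self.by_author.setdefault(author, []).append(entry.entry_id)
        self.activity.append((timestamp, author, "added entry", entry.entry_id))
        return entry


log = DiscourseLog()
log.add("A", "Plan X will reduce traffic.")
log.add("B", "Plan X costs too much.")
log.add("A", "The costs can be shared.")
print(log.by_author["A"])
```

Sorting the same entries into ‘formatted’ files (item b) would need additional fields, such as the issue or proposal an entry refers to, which are left out here.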

‘Analysis’ tasks, examples

f. Translation. Verbatim entries submitted in different languages and their formatted ‘content’ will have to be translated into the languages of all participants. Also, entries expressed in ‘discipline jargon’ will have to be translated into conversational language.

g. Entries will have to be checked for duplication of essentially identical content, expressed in different words (to avoid counting the same content twice in evaluation procedures).

h. Standard information search (‘googling’) for available pertinent information already documented by existing research, databases, case studies etc. This will require the selection of search terms and the assessment of the relevance of found items, which are then entered into a separate section of the ‘verbatim’ file.

i. Entered items (verbal contributions and researched material) will have to be formatted for evaluation; arguments with unstated (‘taken for granted’) premises must be completed with all premises stated explicitly; evaluation aspects, sub-aspects etc must be ordered into coherent ‘aspect trees’.  (Optional: Information claims found in searches may be combined to form ‘new’ arguments that have not been made by human participants).

j. The argument patterns (inference rules) of arguments will have to be identified and checked (to alert participants to validity problems and contradictions).

k. Weight assignments will have to be normalized, judgments and arguments aggregated, and the different aggregation results (from different aggregation functions), as well as their effect on different decision criteria, prepared and displayed.

l. More sophisticated support examples would be the development of systems models of the ‘system’ at hand, (for example, constructing cause-effect connections and loops for the factual-instrumental premises in arguments) to predict performance of proposed solutions, to simulate the behavior of the resulting system in its environment over time.
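
The normalization and aggregation mentioned in item k above might look roughly like the following Python sketch. The function names, the two aggregation rules shown (weighted sum and lowest score), and the aspect labels ‘cost’ and ‘safety’ are assumptions chosen for illustration. The statistics function deliberately reports ‘statistics’ of individual judgments and nothing more.

```python
from statistics import mean, median


def normalize_weights(raw):
    """Scale non-negative raw weights so that they add up to 1."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {key: value / total for key, value in raw.items()}


def aggregate(scores, weights, how="weighted_sum"):
    """Combine one participant's partial scores into an overall score."""
    if how == "weighted_sum":
        return sum(weights[key] * scores[key] for key in scores)
    if how == "minimum":
        # 'weakest link' attitude: the overall score is the lowest partial score
        return min(scores.values())
    raise ValueError("unknown aggregation function: " + how)


def judgment_statistics(values):
    """Statistics of individual judgments, not a 'group judgment'."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


weights = normalize_weights({"cost": 2, "safety": 6})
scores = {"cost": -0.4, "safety": 0.8}
for how in ("weighted_sum", "minimum"):
    print(how, aggregate(scores, weights, how))
print(judgment_statistics([0.5, -0.4, 0.2]))
```

Displaying the results of several aggregation functions side by side, as in the loop above, lets participants see how much the choice of function affects the outcome.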

The boundary between human and machine judgments

It should be clear from preceding sections that general algorithms should not be used to generate evaluative judgments (unless there are criteria expressed in regulations, laws, or norms, to expressly substitute for human judgment.) Any calculated statistics of participant judgments should be clearly identified as ‘statistics’ of individuals’ judgments, not as ‘group judgments’. The boundary issue may be illustrated with the examination of the idea of complete ‘objectification’ or explanation of a person’s basis of judgment, with the ‘formal evaluation’ process explained in that segment. Complete description of judgment basis would require description of criterion functions for all aspect judgments, the weighting of all aspects and sub-aspects etc., and the estimates of plausibility (probability) for a plan to meet the performance expectations involved. This would allow a person A to make judgments on behalf of another person B, while not necessarily sharing B’s basis of judgment. Imagining a computer doing the same thing is meaningful only if all those values of B’s judgment basis can be given to the computer. The judgments would then be ‘deliberated’ and fully explained (not necessarily justified or mandatory for all to share).

In practice, doing that even for another person is too cumbersome to be realistic. People usually shortcut such complete objectification, making decisions with ‘offhand’ intuitive judgments that they do not or cannot explain. That step cannot be performed by a machine, by definition: the machine must base its simulation of our judgment basis on some explanation. (Admittedly, it could simulate the human equivalent of tossing a coin, that is, judge randomly, though most humans would resent having their intuitive judgments called ‘random’.) And vague reference is usually made to ‘common sense’ or otherwise societally accepted values, obscuring and sidestepping the problem of dealing with the reality of significantly different values and opinions.

Where would the machine get the information for making such judgments if not from a human? Any algorithm for this would be written by a human programmer, including the specifics for obtaining the ‘factual’ information needed to develop even the crudest criterion function. A common AI argument would be that the machine can be designed to observe (gather the needed factual information) and ‘learn’ to assemble a basis of judgment, for measurable and predictable objectives such as ‘growth’ or stability (survival) of the system. The trouble is that the ‘facts’ involved in evaluating the performance and advisability of plans are not ‘facts’ at all: they are estimates, predictions of future facts, so they cannot be ‘observed’ but must be extrapolated from past observations by means of some program. And we can deceive ourselves into accepting information about the desirability of ‘ought’ or ‘goodness’ aspects of a plan as ‘factual’ data only by looking at statistics (also extrapolated into the future) or at legal requirements, which must have been adopted by some human agent or agency.

To be sure: these observations are not intended to dismiss the usefulness of AI (that should be called augmented intelligence) for the planning discourse. They are trying to call attention to the question of where to draw the boundary between human and machine ‘judgment’. Ignoring this issue can easily lead to development of processes in which machine ‘judgment’ — presented to the public as non-partisan, ‘objective’, and therefore more ‘correct’ than human decisions, but inevitably programmed to represent some party’s intentions and values — can become sources of serious mistakes, and tools of oppression. This brief sketch can only serve as encouragement to more thorough discussion.


— o —

EVALUATION IN THE PLANNING DISCOURSE — THE DIMINISHING PLAUSIBILITY PARADOX

Thorbjørn Mann,  February 2020

THE DIMINISHING PLAUSIBILITY PARADOX

Does thorough deliberation increase or decrease confidence in the decision?

There is a curious effect of careful evaluation and deliberation that may appear paradoxical to people involved in planning decision-making, who expect such efforts to lead to greater certainty and confidence in the validity of their decisions. There are even consulting approaches that derive measures of such confidence from the ‘breadth’ and ‘depth’ achieved in the discourse.

The effect is the observation that with a well-intentioned, honest effort to give due consideration and even systematic evaluation to all concerns (as expressed, e.g., by the pros and cons of proposed plans perceived by affected and experienced people), the degree of certainty or plausibility for a proposed plan actually seems to decrease, or move towards a central ‘don’t know’ point on a +1 to -1 plausibility scale. Specifically: the more carefully breadth (meaning coverage of the entire range of all aspects or concerns) and depth (understood as the thorough examination of the support, that is, the evidence and supporting arguments, of the premises of each ‘pro’ and ‘con’ argument) are evaluated, the more the degree of confidence felt by evaluators moves from initial high support (or opposition) towards the central point ‘zero’ on the scale, meaning ‘don’t know; can’t decide’.

This is, of course, the opposite of what the advice to ‘carefully evaluate the pros and cons’ seems to promise, and of what approaches striving for breadth and depth actually appear to achieve. This creates a suspicion that either the method for measuring the plausibility of all the pros and cons must be faulty, or that the approaches treating the degree of breadth and depth directly as equivalent to greater support are making mistakes. So it seems necessary to take a closer look at this apparently counterintuitive phenomenon.

The effect was first observed in the course of the review of an article on the structure and evaluation of planning arguments [1] for journal publication; several reviewers pointed out what they thought must be a flawed method of calculation.

Explanation of the effect

The crucial steps of the method (also explained in the section on planning argument assessment) are the following:

– All pro and con arguments are converted from their often incomplete, missing-premises state to the complete pattern explicitly stating all premises (e.g. “Yes, adopt plan A because 1) A will lead to effect B given conditions C, and 2) B ought to be aimed for, and 3) conditions C will be present”).

– Each participant will assign plausibility judgments to each premise, on the +1 / -1 scale where +1 stands for complete certainty or plausibility, -1 for complete certainty that the claim is not true or is totally implausible (in the judgment of the individual participant), and the center point of zero expresses inability to judge: ‘don’t know; can’t decide’. Since in the planning argument all premises are estimates or expectations of future states (effects of the plan, applicability of the causal rule that connects future effects or ‘consequences’ with actions of the plan, and the desirability or undesirability of those consequences), complete certainty assessments (pl = +1 or -1) for the premises must be considered unreasonable; so all the plausibility values will be somewhere between those extremes.

– Deriving a plausibility value for the entire argument from these plausibility judgments can be done in different ways. One extreme is to assign the lowest premise plausibility judgment prempl to the entire argument, expressing an attitude like ‘the strength of a chain is equal to the strength of its weakest link’. Or the plausibility values can be multiplied, so that the argument plausibility for argument i is

            Argpl(i) = Π (prempl(i,j))  for all premises j of argument i

Either way, the resulting argument plausibility cannot be higher than the premise plausibilities.

– Since arguments do not carry the same ‘weight’ in determining the overall plausibility judgment, it is necessary to assign some weight factor to each argument plausibility judgment. That weight will depend on the relative importance of the ‘deontic’ (ought) premises, and can be approximately expressed by assigning each of the deontic claims in all the arguments a weight between zero and +1, such that all the weights add up to +1. So the weight of argument i will be the plausibility of argument i times the weight of its deontic premises: Argw(i) = Argpl(i) x w(i)

– A plausibility value for the entire plan will have to be calculated from all the argument weights. Again, there are different ways to do that (discussed in the section on aggregation), but an aggregation function such as adding all the argument weights (as derived by the preceding steps) will yield a plan plausibility value on the same scale as the initial premise and argument plausibility judgments. It will also be the result of considering all the arguments, both pro and con; and since the argument weights of arguments considered ‘con’ arguments in the view of individual participants will be subtracted from the summed-up weight of ‘pro’ arguments, it will be nowhere near the complete certainty value of +1 or -1, unless of course the process revealed that there were no arguments carrying any weight at all on the pro or con side. That is unlikely, since e.g. all plans have been conceived from some expectation of generating some benefit, and will carry some cost or effort, etc.
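
The steps above can be illustrated with a small Python sketch for one participant’s judgments. It is a simplified reading of the method, not a definitive implementation: the function names, the dictionary layout, the explicit ‘pro’ flag used to subtract con arguments, and all numbers are assumptions made for the example. The example premises are all judged positive, which is the case in which the product cannot exceed the lowest premise plausibility.

```python
from math import prod


def argument_plausibility(premise_pl, rule="product"):
    """Argpl(i) from the premise plausibilities prempl(i,j) of one argument."""
    if not all(-1 < p < 1 for p in premise_pl):
        raise ValueError("premise plausibilities must lie strictly between -1 and +1")
    if rule == "weakest_link":
        return min(premise_pl)      # lowest premise plausibility
    if rule == "product":
        return prod(premise_pl)     # multiply all premise plausibilities
    raise ValueError("unknown rule: " + rule)


def plan_plausibility(arguments, rule="product"):
    """Add the weights Argw(i) of pro arguments and subtract those of con arguments."""
    if abs(sum(a["weight"] for a in arguments) - 1.0) > 1e-9:
        raise ValueError("the weights w(i) must add up to 1")
    total = 0.0
    for a in arguments:
        argw = argument_plausibility(a["premises"], rule) * a["weight"]  # Argw(i)
        total += argw if a["pro"] else -argw
    return total


# Hypothetical judgments: one pro and one con argument, three premises each.
arguments = [
    {"premises": [0.8, 0.7, 0.9], "weight": 0.6, "pro": True},
    {"premises": [0.6, 0.8, 0.7], "weight": 0.4, "pro": False},
]
print(plan_plausibility(arguments))                   # product rule
print(plan_plausibility(arguments, "weakest_link"))   # weakest-link rule
```

With these made-up numbers, the pro argument has Argpl = 0.8 x 0.7 x 0.9 = 0.504 and Argw = 0.3024, the con argument has Argpl = 0.336 and Argw = 0.1344, and the plan plausibility is about 0.17: well below every single premise judgment, although each premise was judged fairly plausible.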

This approach as described thus far can be considered a ‘breadth-only’ assessment, justly so if there is no effort to examine the degree of support of premises. But of course the same reasoning can be applied to any of the premises, to any degree of ‘depth’ as demanded by participants from each other. The effect of overall plan plausibility tending toward the center point of zero (‘don’t know’ or ‘undecided’), compared with initial offhand convinced ‘yes: apply the plan!’ or ‘no: reject!’ reactions, will be the same, unless there are completely ‘principle’-based or logical or physical ‘impossibility’ considerations, in plans that arguably should not even have reached the stage of collective decision-making.
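
The effect of added ‘depth’ can be made visible with a deliberately crude Python sketch. It assumes, purely for illustration, that every premise is itself supported by one argument with the same number of premises, that each premise at the deepest level examined receives the same offhand plausibility, and that the multiplication rule is used throughout. Real discourse will be far less regular, so the numbers show only the direction of the effect.

```python
def premise_plausibility(offhand, premises_per_argument, depth):
    """Plausibility of a premise after examining its support to the given depth.

    depth 0 is the offhand judgment; at each further level the premise takes
    the product of the plausibilities of its supporting argument's premises.
    """
    if depth == 0:
        return offhand
    below = premise_plausibility(offhand, premises_per_argument, depth - 1)
    return below ** premises_per_argument   # product of equally judged premises


for depth in range(4):
    print(depth, round(premise_plausibility(0.9, 3, depth), 3))
```

Even with a confident offhand judgment of 0.9 for every claim, each additional level of examination pulls the resulting plausibility closer to the ‘don’t know’ point of zero.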

Explanation of the opposite effect in ‘breadth/depth’ based approaches

So what distinguishes this method from approaches that claim to use degrees of ‘breadth and depth’ of deliberation as measures justifying the resulting plan decisions, and that in the process increase the team’s confidence in the ‘rightness’ of their decision?

One obvious difference, which must be considered a definite flaw, is that the degree of deliberation, measured by the mere number of comments and arguments, of ‘breadth’ or ‘depth’, does not include assessment of the plausibility (positive or negative) of the claims involved, nor of their weights of relative importance. Just having talked about a number of considerations, without that distinction, cannot already be a valid basis for decisions, even if Popper’s advice about the degree of confidence in scientific hypotheses we are entitled to hold is not considered applicable to design and planning. (“We are entitled to tentatively accept a hypothesis to the extent we have given our best effort to test, to refute it, and it has withstood all those tests”…)

Sure, we don’t have ‘tests’ that definitively refute a hypothesis (or ‘null hypothesis’) that we have to apply as best we can, and planning decisions don’t rest or fall on the strength of single arguments or hypotheses. All we have are arguments explaining our expectations, speculations about the future resulting from our planning actions — but we can adapt Popper’s advice to planning: “We can accept a plan as tentatively justified to the extent we have tried our best to expose it to counterarguments (con’s) and have seen that those arguments are either flawed (not sufficiently plausible) or outweighed by the arguments in its favor.”

And if we do this, honestly admitting that we really can’t be very certain about all the claims that go into the arguments, pro or con, and look at how all those uncertainties come together in totaling up the overall plausibility of the plan, the tendency of that plausibility to go towards the center point of the scale looks more reasonable.

Could these considerations be the key to understanding why approaches relying on mere breadth and depth measurements may result in increased confidence of the participants in such projects? There are two kinds of extreme situations in which it is likely that even extensive breadth and depth discussions can ignore or marginalize one side or the other of necessary ‘pro’ or ‘con’ arguments.

One is the typical ‘problem-solving’ team assembled for the purpose of developing a ‘solution’ or recommendation. The enthusiasm of the collective creative effort itself (but possibly also the often invoked ‘positive’ thinking, the advice to defer judgment so as not to disrupt the creative momentum, as well as the expectation of a ‘consensus’ decision?) may focus the thinking of team members on ‘pro’ arguments, justifying the emerging plan but neglecting or diverting attention from counterarguments. Is finding sufficient good reasons for the plan then taken to be enough to make a decision?

An opposite type of situation is the ‘protest’ demonstration, or events arranged for the express purpose of opposing a plan. When disgruntled citizens, outraged by how a big project will change their neighborhood, count up all the damaging effects, must we not assume that there will be a strong focus on highlighting the plan’s negative effects or potential consequences, on assembling a strong enough ‘case’ to reject it? In both cases, there may be considerable and even reasonable deliberation in breadth and depth involved, but also possible bias due to neglect of the other side’s arguments.

Implications of the possibility of decreasing plan plausibility?

So pending some more research into this phenomenon, — if found to be common enough to worry about, — it may be useful to look at what it means: what adjustments to common practice it would suggest, what ‘side-stepping’ stratagems may have evolved due to the mere sentiment that more deliberation might shake any undue, undeserved expectations in a plan. Otherwise, cynical observers might recommend throwing up our arms and leaving the decision to the wisdom of ‘leaders’ of one kind or another, in the extreme to oracle-like devices — artificial intelligence from algorithms whose rationales remain as unintelligible to the lay person as the medieval ‘divine judgment’ validated by mysterious rituals (but otherwise amounting to tossing coins?).

Besides the above-mentioned research into the question, a first step would be to examine common approaches on the consulting market for their vulnerability to overstating the confidence that deliberation justifies. For example, adding plausibility assessment to the approaches using depth and breadth criteria would be necessary to make them more meaningful.

The introduction of more citizen participation into the public planning process is an increasingly common move that has been urged — among other undeniable advantages such as getting better information about how problems and the plans proposed to solve them actually affect people — to also make plans more acceptable to the public because the plans then are felt to be more ‘their own’. As such, could this make the process vulnerable to the above first fallacy of overlooking negative features? If so, the same remedy of actually including more systematic evaluation into the process might be considered.

A common temptation by promoters of ‘big’ plans can’t be overlooked: to resort to ‘big’ arguments that are so difficult to evaluate that made-up ‘supporting’ evidence can’t be distinguished from predictions based on better data and analysis (following Machiavelli’s quip about ‘the bigger the lie, the more likely people will buy it’…). Many people already are suggesting that we should return to smaller (local) governance entities that can’t offer big lies.

Again: this issue calls for more research.

[1]   “The Structure and Evaluation of Planning Arguments”  Thorbjoern Mann, INFORMAL LOGIC  Dec. 2010.

— o —

EVALUATION IN THE PLANNING DISCOURSE — PROCEDURAL AGREEMENTS

An effort to clarify the role of deliberative evaluation in the planning and policy-making process.  Thorbjørn Mann,  February 2020

PROCEDURAL AGREEMENTS FOR EVALUATION

The need for procedural agreements

Any group, team or assembly having decided to embark upon a common evaluation / deliberation task aimed at a recommendation or decision about a plan, will have to adopt a set of agreements about the procedure to be followed, explicitly or implicitly. These rules can become quite detailed and complicated. Even the familiar ‘rules of order’ of standard parliamentary procedure, aiming at simple yea/nay decisions on ‘motions’ for the assembly to accept or reject, will become book-length guides (like ‘Robert’s Rules of Order’) that the chairpersons of such processes may have to consult when disputes arise. For simplified versions based on the expected simplicity of ending the discussions with a majority vote, and citizens’ familiarity with basic rules, agreements can even be tacitly taken for granted, without recourse to written guides. However, this no longer applies when the decision-making body engages in more detailed and systematic deliberation aiming at making the decisions more transparently justified by the evaluative judgments made on the comments in the discourse.

General overall agreements versus procedures for ‘special techniques’

This could be seen as a call for a general procedure that includes the necessary procedural rules, as an extension of the familiar parliamentary procedure. Would such a one-size-fits-all solution be appropriate? As the preceding sections of this study show, we now see not only a great variety of different evaluation tasks and context situations, but also a variety of different ‘approaches’ for such processes now on the ‘market’, especially as they are assisted by new technology. Each one comes with different assumptions about the rules or ‘procedural agreements’ guiding the process. So it seems that the question is less one of developing and adopting one general-purpose pattern than one of providing a ‘toolkit’ of different approaches that the participants in a planning process could choose from as the task at hand requires. That opportunity-step for choice must be embedded in a general and flexible overall process that participants either would be familiar with already, or would be able to easily learn and agree to.

Once a special technique is selected, as decided by the group, its procedural steps and decision rules should then be explicitly agreed upon at the very beginning of the specific process (the more so, the ‘newer’ the approach, tools and techniques), so as to avoid disruption of the actual deliberation by disagreements about procedure later on. Such quibbles could easily become quite destructive and polarizing, and even their in-process resolution can introduce significant bias into the actual assessment work itself. It may be necessary to change some rules, as the participants learn more about the nature of the problem at hand. That process should be governed by rules set out in the initial agreements: a provision such as the ‘Next step’ proposed in the process for the overall planning discourse platform would offer that opportunity. (See ‘PDSS-REVISED’.)

This seemingly matter-of-course step can become controversial because different ‘special techniques’ may involve different concepts and corresponding vocabulary to be used: even ‘systems’ approaches of different ‘generations’ are likely to use different labels for essentially the same things, which can result in miscommunication and misunderstanding or worse. New techniques and tools may require different responsibilities, behavior, decision modes, replacing rules still taken for granted: must new agreements be set ‘upfront’ to prevent later conflicts?

The main agreements (possibly different rules for different project types) will then cover the basic procedural steps; the ‘stopping rules’ for deciding when a decision can be said to have been accepted (since one of the key properties of ‘wicked problems’ is that there is nothing in the nature of the problem itself that tells problem-solvers that a solution has been reached and that the work can stop); and the decision criteria and modes according to which this should be done. For the details of the evaluation part itself, the kinds of judgments and judgment scales will have to be agreed upon, so that e.g. a judgment score will have the same meaning for all participants. (These issues will be addressed in separate sections.)

An argument can be made that efforts should be made to preserve consistency between the overall approach and its frame of reference and vocabulary, and any ‘special techniques’ for evaluation within that process along the way.

Doing without cumbersome procedural rules?

There will be attempts to escape procedures felt to be too ‘cumbersome’ or bureaucratic, with an easier route to a decision. Majority voting itself can be seen as such an escape. Even easier are decision criteria such as ‘consent’ — declared, for example, by the chair that there are ‘no more objections’ combined with ‘time’s up’ — which may indicate that the congregation has become exhausted, rather than convinced of the advantages of a proposed plan, or dissuaded from voicing more ‘critical’ questions. But aren’t the conditions leading to ‘consent’ outcomes in some approaches — group size, seating arrangements, sequences of steps and phases — themselves procedural provisions?

Examples of aspects calling for agreements

Examples of different procedural agreements are the above-mentioned ‘rules of order’, the steps for determining the ‘Benefit/Cost Ratio’ of plans; provisions for ‘formal evaluation’ process of the ‘quality’ of a proposed plan or for the evaluation of a set of alternative proposals; agreements needed for evaluating the plausibility of a plan by systematic assessment of argument plausibility; the guides for a ‘Pattern Language’ approach to planning. (Some of these will be described in separate segments).

The procedural agreements cover aspects such as the following:
– The conceptual frame of reference and its vocabulary and corresponding techniques and displays;
– Proper ‘etiquette’ and behavior;
– The process steps (sequence), participant rights and responsibilities;
– Formatting of entries as needed for evaluation;
– For the evaluation tasks: judgment scales and units, the meaning of the scores;
– The aggregation functions to be used to derive overall judgments from partial judgment scores and from individual participant scores to ‘group’ statistics and decision rules;
– Decision criteria and decision modes;
– The stopping rule(s) for the process.

Specific agreements for different evaluation ‘approaches’ and special techniques must then be discussed in the sections describing those methods.


–o–

Eerily erring electioneering?

In the Fog Island Tavern on a dreary day in February:

– You look worried, Bog-Hubert: What’s bugging you today?
– Oh boy. I never thought I’d see Abbé Boulah getting worked up over politics, but let me tell you, Vodçek: this election is getting to him.
– Really? I thought he’d written off this whole voting business long ago, as a totally misguided crutch to bring any political or planning discourse to a meaningful decision?
– Yeah, he keeps working on his schemes to improve that. But you should have heard him this morning — you’d think he’s still training hard for his old pet project to get endurance cussing accepted as a new Olympic discipline —
– So what is it that’s getting him riled up on this one now?
– Well, I think he’s mainly disappointed in the candidates’ apparent inability to learn from past mistakes, and to focus on what’s really important. For example, this business about starting to discredit the current front runner, because he’s too, shall we say, unorthodox for the party establishment.
– What’s wrong with that? It’s politics, isn’t it?
– Fulminating stinkbomb-bundles and moccasin-mouth-ridden swamp-weed kudzu tangles: you too, now?
– Oh Bog-Hubert: excellent — you’re shooting for a medal in that sport too?
– By all the overgrown rusty Dodge truck skeletons in my cousin’s front yard: Don’t you, don’t they get it?
– Get what? It’s BAU politics. So, care to explain?
– Well, isn’t it obvious: Rather than tearing each other apart, shouldn’t they try to figure out what it is that makes the frontrunner’s — and the opposition’s message more appealing to those voters they want to convince to vote for them, and come up with a b e t t e r message, a more appealing and convincing vision?? Because that strategy is bound to come back and kick’em in the youknowwhat…
– Hmm. I see what you mean, by Abbé Boulah’s drooping mustache! And it’s giving the opposition free stinkbombs to launch at whoever ends up being the nominee…
– Yeah. And not only that: What if part of the problem is precisely that old habit of the old swamp establishment — of both parties — that those disgruntled voters are getting tired of? And that’s the rusty musket the establishment keeps shooting itself in the foot with?
– I can see why this upsets our friend. The futility of the hope that they’ll ever learn, I mean. Let’s try to get him back to work on those better ways he’s working on…
– I’ll drink to that. Do they make a decent grappa from Sonoma grapes?

— o —