How to get to the shadow side: a problem with back-chaining
“The conscious mind knows nothing beyond the opposites and, as a result, has no knowledge of the thing that unites them. For this reason all uniting symbols have a redemptive significance.” C.G. Jung, Archetypes & the Collective Unconscious, para. 285
This kind of question,
… and what were the causal influences on that? …
… and what led to / contributed to that happening? …
… what were the main reasons for that … ?
repeated successively, is part of what we can call “causal back-chaining”.
It is a core part of the QuIP data-gathering approach, and now it is also a key part of how the StorySurvey “question bot” works.
QuIP researchers are trained to do more than simply repeat this question and then stop. Even the StorySurvey question bot is more subtle than that. But at its core, this question, with light adaptations, gives us a single iterative step which can help construct an entire causal map.
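To make the “single iterative step” concrete, here is a minimal Python sketch of back-chaining as a loop. It is only an illustration, not the actual QuIP or StorySurvey implementation: the function `back_chain`, the `ask` callback (standing in for an interviewer or the question bot) and the `max_depth` cut-off are all assumptions made for this example.

```python
from collections import deque
from typing import Callable, List, Tuple

Link = Tuple[str, str]  # (cause, effect)


def back_chain(outcome: str,
               ask: Callable[[str], List[str]],
               max_depth: int = 3) -> List[Link]:
    """Repeatedly ask "what influenced that?" and collect causal links.

    `ask` is a hypothetical stand-in for the respondent: given a factor,
    it returns the factors they name as influences on it.
    """
    links: List[Link] = []
    seen = {outcome}
    queue = deque([(outcome, 0)])
    while queue:
        factor, depth = queue.popleft()
        if depth >= max_depth:
            continue  # stop digging further back from this factor
        for cause in ask(factor):
            link = (cause, factor)
            if link not in links:
                links.append(link)
            if cause not in seen:
                seen.add(cause)
                queue.append((cause, depth + 1))
    return links


# Canned answers playing the role of a respondent (invented for illustration).
answers = {
    "Bought concert ticket": ["Saved pocket money", "Gave up buying sweets"],
    "Gave up buying sweets": ["Wanted to go to the concert"],
}
links = back_chain("Bought concert ticket", lambda f: answers.get(f, []))
```

Each pass through the loop is one application of the question; the list of links is the beginning of a causal map.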
This causal back-chaining algorithm is not perfect. It can’t be: causality is a vague concept with overlapping demands on it. It is Homo sapiens’ way of storing knowledge about which buttons to press, but it is also part of our systems of judging, rewarding and punishing. Any algorithm we come up with for eliciting a causal map from a respondent will have flaws. We just have to do the best we can.
One key problem with the algorithm, which we address here, is that it primes us to think only of things which were in a sense positive, which “pushed the consequence factor in the same direction”. There might be good reasons for this in terms of legal and moral thinking, but if we want to find out more about “the complex web of causal factors affecting this outcome”, it’s simply a strong and unwanted bias which we need to remove.
In a minute, we’ll see an example of where the back-chaining algorithm falls down. First, here’s a closely related example where it seems to work fine.
Concert example
Jo wants to save up money for a concert. She succeeds. What was the cause of her success?
- She saved the money from Uncle Nick
- She saved nearly all her pocket money
- She gave up buying sweets
These are three factors which contribute to a (possibly implicit) numerical intermediate factor (amount of money saved) which is then dichotomised into “saved enough / didn’t save enough”.
The third one, giving up buying sweets, seems like a natural answer even though you could say it’s a counterfactual: nothing actually happens (she simply doesn’t spend money on sweets) but if she hadn’t given them up, she wouldn’t have bought the ticket.
Here, back-chaining is successful in implicitly identifying a negative factor, buying sweets, which makes the outcome less likely. But it only succeeds in identifying it because it is contained within its opposite, giving up buying sweets, which “pushes the consequence factor” in the same direction as the others.
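The arithmetic behind this example can be sketched in a few lines of Python. The amounts and the ticket price are invented purely for illustration, and `saved_enough` is a hypothetical helper: the point is only that the intermediate factor is a sum of contributions, which is then dichotomised, and that “buying sweets” would enter that sum as a drain.

```python
from typing import Dict

TICKET_PRICE = 50  # hypothetical price, for illustration only


def saved_enough(contributions: Dict[str, float], price: float) -> bool:
    """Dichotomise the running total into saved enough / didn't save enough."""
    return sum(contributions.values()) >= price


# Invented amounts: what actually happened (sweets given up).
actual = {"money from Uncle Nick": 20, "pocket money saved": 30}

# The same story if the drain had been present.
with_drain = {**actual, "buying sweets": -10}

outcome_actual = saved_enough(actual, TICKET_PRICE)
outcome_with_drain = saved_enough(with_drain, TICKET_PRICE)
```

In the actual story the drain never appears as a factor at all; it is only visible through its opposite.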
Football example
Liverpool and Everton are 0-0 until almost the final whistle, and then Everton get a goal in the last minute. What were the causes of Everton’s success? The pundits say:
- Everton were more clinical in the final third
- Closing down the Liverpool strikers
- … etc
Whereas, reasonably enough, if Liverpool had got the winning goal, the explanations would probably have been different, for example:
- Liverpool made more chances, and something had to go in
- Greater width
- … etc
There’s nothing wrong with this. But we have to be aware that the list of explanations given by the pundits is liable to flip at the last moment, depending on who gets that final goal. If we want to actually understand the causal dynamics of the game which was played, we really need the contents of both lists, the one where Liverpool win in the end, and the one where Everton win in the end, and we’d also like to know how these factors combine.
If we want to understand the game well enough to, say, advise a manager on a future encounter between the same clubs, it would be absurd to just pick one of the pundit lists we just mentioned, and throw the other list away.
One solution: pairs of initial questions
If we say that QuIP and StorySurvey can help understand the web of causal factors surrounding an intervention and/or an outcome, we have to do more than simple back-chaining, because we have to get at those negative factors too.
One way to do this is to ask a pair of initial questions:
- What were the best moments of Liverpool’s play, and what factors caused them, and what influenced those factors …
- What were the worst moments of Liverpool’s play, and what factors caused them, and what influenced those factors …
We can even add:
- What were the best moments of Everton’s play, and what factors caused them, and what influenced those factors …
- What were the worst moments of Everton’s play, and what factors caused them, and what influenced those factors …
This is a useful approach (and it’s easy to do in StorySurvey). It’s an interesting challenge to then combine, where appropriate, factors which are common to the different sets of answers, including factors in one set which are contrary to factors in another set (e.g. “Liverpool seizing chances” and “Liverpool not seizing chances”). In particular, in StorySurvey, when respondents are answering question Q, we sometimes also suggest to them existing factors which have emerged within answers to question Q; should we also suggest to them factors which other respondents already mentioned when answering a different initial question?
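As a rough illustration of what combining contrary factors could involve, here is a deliberately naive Python sketch which pairs up labels that differ only by the word “not”. This is an assumption made for the example, not how StorySurvey works; the helper names `base_form` and `contrary_pairs` are hypothetical, and real labels would need far more careful matching.

```python
from typing import Dict, List, Tuple


def base_form(label: str) -> Tuple[str, bool]:
    """Return the label without the word 'not', plus whether it was negated."""
    words = label.lower().split()
    negated = "not" in words
    return " ".join(w for w in words if w != "not"), negated


def contrary_pairs(set_a: List[str], set_b: List[str]) -> List[Tuple[str, str]]:
    """Find factors in set_a whose naive opposite appears in set_b."""
    index: Dict[Tuple[str, bool], str] = {base_form(lbl): lbl for lbl in set_b}
    pairs = []
    for label in set_a:
        base, negated = base_form(label)
        other = index.get((base, not negated))
        if other is not None:
            pairs.append((label, other))
    return pairs


# Factors from the "best moments" and "worst moments" questions (illustrative).
best = ["Liverpool seizing chances", "Greater width"]
worst = ["Liverpool not seizing chances", "Poor marking"]
pairs = contrary_pairs(best, worst)
```

Once such pairs are found, they could be merged into a single factor with links of opposite polarity, or simply flagged for a human coder to review.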
Solving the deeper problem
Using pairs of initial questions can really help. But it only solves our problem for the first step, the initial question. For each subsequent step, we get the same problem: asking simply “what were the influences on that …” is biased to pick up the positive contributions but not the negative contributions.
There are different kinds of “negative contribution”, which overlap.
- Factors which are a drain on a consequence factor (e.g. when we are thinking of the consequence factor as a linear sum of different influences, like a bank account or a flood level) and pull it “downwards”.
  - For example, “lending money to little sister” in the Concert example above
- Factors which were a danger to a consequence factor but never actually had a decisive effect on it
  - For example, “This season, Liverpool are good at getting goals in the last five minutes” in the Football example above
- …
So what can we do?
Can we add a “shadow side” to the back-chaining question armoury? It’s possible to do in individual cases, but is there a general question form which can help us?
Counterfactual worlds
How about this:
You mentioned X. If X had not happened, what would have been the reasons for it not happening?
The trouble with this one is that it opens up a counterfactual world, and if we back-chain from hypothetical events we might get even more hypothetical events … We don’t want to know about an alternative universe, we want to know about this one.
Best solution?
So my final suggestion is this, our best attempt at an additional question for causal back-chaining which can also catch “opposite influences”: drains and dangers.
You mentioned X. Were there any “opposite influences” on X: factors which tried to make X not happen, or tried to make the opposite of X happen, or which were a drain on X, or might have blocked X?
It’s a bit strange, but could it work? Does it work with these typical factors mentioned in a StorySurvey on the response to Covid-19?
- Pfizer Vaccine
- Money
- Rapid vaccine rollout
- misinformation
- political motivations
- medical research
- development of vaccines
- countries not sharing data
- countries slow to react
- secrecy surrounding virus
- quick spread
- human deaths
- countries working together
- human nature
- united people
- Early lockdowns
- The cdc eventually said to wear them.
We have to be careful not to label these drains, blocks and dangers (words which already have a negative sentiment) as “negative factors” (as I did above). This would be very confusing when the consequence factor itself already has a negative rather than a positive sentiment, like “human deaths”. You wouldn’t naturally think of a factor which blocked or reduced human deaths as negative. That’s why I suggest “opposite influences”.
The important thing is that back-chaining with shadow questions should only surface factors which are/were actually present, and should not open up a parallel universe.
Encoding the results
When we back-chain using this “opposites back-chaining” question:
You mentioned “Lockdown happened”: Were there any “opposite influences” on that: factors which tried to make it not happen, or tried to make the opposite of it happen, or which were a drain on it, or might have blocked it?
and someone says “pressure from sceptics”, we could encode this as “Pressure from sceptics -> ~Lockdown happened” using our “opposites symbol”, ~. After that, we can if we wish continue to back-chain from “Pressure from sceptics” in the normal way.
But it’s not quite clear that the opposites symbol is the right way to go. Because up to now, with “barefoot” coding, from any causal claim such as
Source 1: Government intervened -> Lockdown happened
we can infer not only the causal link but also the propositions at each end of the arrow:
Source 1: Government intervened
and
Source 1: Lockdown happened
whereas here, it isn’t actually the case that our source claims “~Lockdown happened”, i.e. that the lockdown didn’t happen.
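One possible way round this, offered only as a sketch, is to record the opposition on the link rather than inside the factor label, so that the tilde is a matter of display and is never inferred as a proposition. The `Claim` class, its `opposing` flag and the rule used in `inferred_propositions` are assumptions for illustration, not an established coding convention.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Claim:
    source: str
    cause: str
    effect: str
    opposing: bool = False  # True for an "opposite influence" link

    def render(self) -> str:
        """Display the claim, marking an opposed effect with the ~ symbol."""
        target = f"~{self.effect}" if self.opposing else self.effect
        return f"{self.source}: {self.cause} -> {target}"


def inferred_propositions(claim: Claim) -> List[str]:
    """Propositions we allow ourselves to infer from a single claim.

    Assumed rule: the cause is always asserted; the effect is asserted only
    for ordinary links, so an opposing link never yields "~effect".
    """
    props = [f"{claim.source}: {claim.cause}"]
    if not claim.opposing:
        props.append(f"{claim.source}: {claim.effect}")
    return props


ordinary = Claim("Source 1", "Government intervened", "Lockdown happened")
shadow = Claim("Source 1", "Pressure from sceptics", "Lockdown happened",
               opposing=True)
```

With this rule, the shadow link still appears in the map as an influence against “Lockdown happened”, but the only proposition it contributes on its own is that there was pressure from sceptics.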
Conclusion
Can we add this kind of shadow question (sometimes) into the StorySurvey question bot algorithm? What about QuIP?
Finally …
I’m not arguing here for a variable-based approach as in quantitative causal analysis or systems dynamics. We don’t need to think of the factors we identify as quantities which are eternally present within the context and simply vary over time. We are of course still thinking in terms of propositions rather than variables, which means in particular we can embrace narratives, e.g. we can include factors which only emerge half-way through the game (Mo Salah was sent off!) and which spawn a lot of other factors in their wake.
Also, I’ve used the word “blocker”, which is sometimes used for a third, moderating factor F which changes the influence of a first factor C on a second factor E. But here, “blocker” simply means a factor C which endangers or drains a factor E, making it less likely. Moderating factors are still not specifically captured by our algorithm, even when extended as suggested here.