1 The problem of predictive inadequacy for intentionalism

What makes it the case that a demonstrative pronoun such as ‘this’ or ‘that’ refers to a certain object when uttered by a speaker?Footnote 1, Footnote 2 A metasemantics for demonstratives is a theory that answers this question. Intentionalism is a broad family of metasemantics for demonstratives. Here is one formulation of intentionalism: the reference of an uttered demonstrative d is object o only if the speaker intends to refer to o with d.Footnote 3 Some authors take intentions to be the sole determinant of reference (Åkerman, 2009; Bach, 1992; Kaplan, 1989b; Perry, 2009; Stokke, 2010), while some do not and further disagree about additional determinants (King, 2014; Reimer, 1992; Speaks, 2016).

Various objections have been levelled against intentionalism, but in this article I focus on the issue of predictive adequacy—i.e. whether intentionalism makes the right predictions about the reference of demonstratives.Footnote 4 Intentionalism is challenged by cases in which, intuitively, the referent is a certain object although the speaker does not intend to refer to this object. Here is the most notorious case of this sort, courtesy of David Kaplan:

Suppose that without turning and looking I point to the place on my wall which has long been occupied by a picture of Rudolf Carnap and I say:

(27) That is a picture of one of the greatest philosophers of the twentieth century.

But unbeknownst to me, someone has replaced my picture of Carnap with one of Spiro Agnew (adapted from Kaplan, 1978, p. 239).Footnote 5

Opponents of intentionalism may present the following argument on the basis of this case. The speaker intends to refer to Carnap’s picture, not Agnew’s picture. Yet intuitively the referent of the demonstrative is Agnew’s picture. Therefore intentionalism is false.Footnote 6

Here is a common response to this argument. The argument mistakenly assumes that the speaker of the Carnap-Agnew case intends to refer to Carnap’s picture only. In fact, the speaker intends to refer to Agnew’s picture too. This further intention is usually presented as an intention to refer to the F, where the description ‘the F’ is satisfied by Agnew’s picture. For instance, it seems true that the speaker intends to refer to the picture behind him. The second step of the response consists in claiming that the intention about Agnew’s picture prevails in the determination of reference. This response to the Carnap-Agnew case may be called the ‘multiple intentions’ response.

The multiple intentions response is unfortunately ad hoc. We are conveniently told that the intention about the intuitive referent is the reference-determining one, but no justification is given. What we need is a general theory that predicts the ascendancy of the intention about the intuitive referent in the Carnap-Agnew case. This is where PI intentionalism enters the stage.

2 Structured intentions and PI intentionalism

Several intentionalists go beyond the multiple intentions response (Bach, 1992; King, 2013; Perry, 2009; Reimer, 1992). They first observe that the intention about Carnap’s picture and the intention about Agnew’s picture are part of a common structure of intentions, and that the intention about Agnew’s picture occupies a certain place—which may be called the proximal one—in this structure. They further propose that only proximal intentions determine reference. This is the view I call PI intentionalism.

Let me sketch what I take to be the best version of PI intentionalism. This version anchors itself to a fully general view about intentional action. Among PI intentionalists, only King ties his metasemantics to a general view of this sort (King, 2013).Footnote 7 One might call this general view the doctrine of structured intentions. The doctrine of structured intentions is widely endorsed, not only in philosophy (Bratman, 1990; Mele, 1992; Searle, 1980) but also at the border of philosophy and psychology (Pacherie, 2008).

One starting point for the doctrine of structured intentions is the platitude that we often do something by doing something else. I volunteer by raising my hand. I score a try by grounding the ball behind the line. These descriptions of actions reflect the further platitude that we attain a certain end by employing certain means. According to the doctrine of structured intentions, if an action is fully intentional, then to each level of description of an action corresponds some intention. When I intentionally score a try by grounding the ball, I have an intention corresponding to the ‘score a try’ level of description, and I have an intention corresponding to the ‘grounding the ball’ level of description. Furthermore, an explanatory structure ties these intentions together: I intend to ground the ball because I intend to score a try.Footnote 8 One may call the explanatory intention the distal intention, and call the explained intention the proximal intention.Footnote 9, Footnote 10 Structured intentions are sometimes captured with formulations such as ‘A intends to φ by ψ-ing’—e.g. the student intends to volunteer by raising her hand. The distal intention attaches to φ, and the proximal intention attaches to ψ.

I will sometimes speak of proximal and distal acts, but this is just for convenience: a metaphysics of structured acts is not strictly required. Neither need we assume that the intention corresponding to a level of description of the action φ is the intention to φ—what Bratman (1984) calls ‘the simple view’. All we need is distinct intentions corresponding to different levels of description of an action, and these intentions to be ordered by an explanatory relation.

Here is how PI intentionalists may apply the doctrine of structured intentions to utterances, and more specifically to the Carnap-Agnew case. In the Carnap-Agnew case, the action of the speaker may be described as follows: she expresses a thought about Carnap’s picture by pointing at the picture behind her and uttering ‘That is a picture…’. The speaker’s distal intention attaches to the level of description of the action before ‘by’. PI intentionalists claim that this intention picks out Carnap’s picture. The speaker’s proximal intention attaches to the level of description of the action after ‘by’. PI intentionalists claim that this intention picks out the picture behind the speaker, i.e. Agnew’s picture.

Once anchored to the doctrine of structured intentions, PI intentionalism seems to offer the most principled intentionalist response to the Carnap-Agnew case. First, the ascendancy of the intention about Agnew’s picture is motivated by the general ascendancy of proximal intentions. Secondly, the proximal status granted to the intention about Agnew’s picture is motivated by a general view connecting descriptions of action and intentions. I take this ‘anchored’ version of PI intentionalism as my target at the start of the next section. I take on ‘non-anchored’ versions of PI intentionalism later in Sect. 3.3.

3 The predictive inadequacy of PI intentionalism

3.1 Ostensive proximal intentions

Let us consider the Carnap-Agnew case once again. Since the speaker intentionally points at the picture behind her, it is intuitively the proximal intention attaching to her pointing gesture that secures the right prediction for PI intentionalism. Let us call proximal intentions attaching to ostensive gestures ostensive intentions. If the PI intentionalist’s take on the Carnap-Agnew case is correct, the ostensive intention of the speaker is about the picture behind her. This ostensive intention is determinate, in the sense that it picks out a unique object. Now, there is a tension between this take and the widely acknowledged view that ostensive gestures are indeterminate (Kaplan, 1989a, 1989b; King, 2014; Reimer, 1992). The Carnap-Agnew case is a case in point: the speaker’s ostensive gesture does not determine Agnew’s picture more than its frame, the nail on which the frame hangs, or the glass screen protecting the picture.

A gap between indeterminate ostensive gestures and determinate ostensive intentions needs to be filled. At this point it seems natural to let the speaker’s beliefs fill this gap. The speaker wants to communicate a thought about Carnap’s picture, and she has beliefs of the form Carnap’s picture is the F. It seems then right to attribute to the speaker the ostensive intention e.g. to point at the F. The speaker believes that Carnap’s picture is the picture on the wall behind her, and so her ostensive intention is to point at the picture on the wall behind her.

This line of thought faces the immediate problem that the speaker might have several beliefs of the form Carnap’s picture is the F, and that some of these beliefs might not target Agnew’s picture. Sure, the speaker believes that Carnap’s picture is the picture on the wall behind her. But she might also believe (truly let’s say) that Carnap’s picture is her ten-year anniversary present. If the former belief fixes the content of the ostensive intention, PI intentionalism makes the right prediction about reference. But if the latter belief does, PI intentionalism makes the wrong prediction. In addition, one cannot arbitrarily stipulate that the former belief trumps the latter belief. Relying on descriptive beliefs to make the content of ostensive intentions determinate thus leads to problems similar to those afflicting descriptivist metasemantics for proper names. This has been noted by Speaks (2017, p. 731) and Devitt (2022, pp. 1000–1001). One faces a double threat of misdescription and arbitrariness: some of the speaker’s beliefs denote another object than the intuitive referent, and one cannot arbitrarily stipulate that the beliefs denoting the intuitive referent are the content-fixing ones.Footnote 11

There is an intuitive way out of this problem. The speaker’s belief that Carnap’s picture is the picture on the wall behind her is intuitively relevant to the pointing gesture to which her ostensive intention attaches. By contrast, her belief that Carnap’s picture is her ten-year anniversary present is intuitively irrelevant to her pointing gesture. Can we make good on this intuitive contrast? Reimer writes: “The relevant beliefs will be those that connect the intended demonstratum (the object of the primary [i.e. distal] intention) with the demonstrative act” (Reimer, 1992, p. 390). Reimer does not say what the nature of this connection is, but one can extract from King’s work the idea that the connection is explanatory (King, 2013). The beliefs that fix the content of ostensive intentions are those that in some intuitive sense explain the speaker’s ostensive act. In the Carnap-Agnew case, the speaker points as she does because she believes that Carnap’s picture is the picture on the wall behind her. This explanatory belief fixes the content of the ostensive intention, which turns out to be determinate and to pick out Agnew’s picture. Or so the story goes.

There is something wrong with this story. Why does the speaker of the Carnap-Agnew case make the ostensive gesture that she makes? Well, she points behind her because she believes that Carnap’s picture is on the wall behind her. The explanatory belief of her ostensive act is then really a belief about the location of Carnap’s picture. And this location belief does not target Agnew’s picture any more than its frame, the nail on which it hangs, etc. Sure, Agnew’s picture is on the wall behind the speaker. But so are the frame, the nail, etc. Explanatory beliefs of ostensive acts cannot buy us determinate ostensive intentions.

The story presented two paragraphs ago tries to conceal this by smuggling additional properties into the explanatory belief, e.g. the F on the wall behind the speaker. The reality is that beliefs other than the explanatory belief must be recruited to yield an ostensive intention to point at the F on the wall behind the speaker. And this brings us back to the double threat of misdescription and arbitrariness. Suppose that the speaker believes that Carnap’s picture is a painting, and further believes that it is the painting on the wall behind her (call this further belief B1). Now suppose that Agnew’s picture is not a painting, but a photograph. If belief B1 is allowed to fix the content of the speaker’s ostensive intention, this ostensive intention does not pick out Agnew’s picture. That’s the misdescription problem. Of course, the speaker also believes that Carnap’s picture is the picture behind her (call this belief B2). But why should B2 (rather than B1) fix the content of the speaker’s ostensive intention? That’s the arbitrariness problem.

We have tried to escape the problem of misdescription and arbitrariness for ostensive intentions by appealing to explanatory beliefs, but this appeal has only led us back to it. Here is the problem in its most general form: if the gap between indeterminate ostensive acts and determinate ostensive intentions is filled by descriptive beliefs, then PI intentionalism faces a double threat of misdescription and arbitrariness. Now, one can try to reject the antecedent of this conditional on various grounds. One might first deny that the gap between indeterminate ostensive acts and determinate ostensive intentions is filled by beliefs. The broad alternative is a form of externalism according to which the content of the ostensive intention is fixed by facts beyond the speaker's mental states. I do not know what form such a view could take. I myself cannot think of an externalist mechanism of determination of intention-content which is both independently plausible and guarantees that Agnew's picture is picked out by the speaker’s ostensive intention in the Carnap-Agnew case.

Another option is to grant that beliefs fix the content of ostensive intentions while denying that content-fixing beliefs must be descriptive. Content-fixing beliefs could instead be fully de re, e.g. believing of Carnap’s picture and Agnew’s picture that they are identical. This might then yield a de re ostensive intention about Agnew’s picture, e.g. to point at it. I see no reason to bar de re beliefs from fixing the content of ostensive intentions in general. PI intentionalists are open to this too (King, 2013, p. 301; Perry, 2009, p. 190; Reimer, 1992, pp. 391–392). However, the local consensus on the Carnap-Agnew case seems to be that the speaker cannot have mental states whose content includes Agnew’s picture itself. This is presumably because the speaker has never seen Agnew’s picture, and has never even heard of it. None of the relations between thinker and object that are usually regarded as allowing de re thought (perception, memory, communicative chains) holds between the speaker and Agnew’s picture.Footnote 12

Now, there are several views in the literature that allow de re thought in the absence of such relations.Footnote 13 But even among these more liberal views, many do not predict that in the Carnap-Agnew case the speaker thinks de re about Agnew’s picture. To give just one example, Jeshion proposes that an object being significant to an agent, in the sense that it has a considerable impact on her cognitive and affective life, is enough for the agent to think about this object de re (Jeshion, 2010). For instance, an adoptee who fervently hopes that she will one day meet her unknown biological mother thinks of her de re. In the Carnap-Agnew case, Agnew’s picture is not significant to the speaker in the relevant sense, and so Jeshion’s liberal view does not allow the speaker to think of Agnew’s picture de re. As far as I can see, the only theory of de re thought that could allow this is one according to which agents can voluntarily introduce a name-like mental vehicle to think about an object satisfying a certain description, and this voluntary introduction is enough to make the object enter the content of their thought.Footnote 14

Even if this hyper-liberal view is endorsed, it is far from clear that a correct prediction on the Carnap-Agnew case can be reached. Assuming that the speaker can think of Agnew’s picture de re, she suffers from confusion: she takes Carnap’s picture and Agnew’s picture to be one and the same object.Footnote 15 Different positions have been taken on the content of the states of agents who suffer from object-confusion, but none of them holds that only one of the confused objects is the content. One view is that confused de re thoughts are empty (Lawlor, 2007; Recanati, 2012).Footnote 16 Another is that they partially refer to each of the confused objects (Recanati, 2016).Footnote 17 Yet another view is that they have a non-actual object as their content (Millikan, 2000; Unnsteinsson, 2019). Overall, PI intentionalists are not in a position to claim that the speaker of the Carnap-Agnew case has a de re ostensive intention about Agnew’s picture. More generally, I conclude that PI intentionalists are not in a position to claim that the speaker has a determinate ostensive intention (either descriptive or de re) about Agnew’s picture.

A concessive response to this problem calls for consideration. In the Carnap-Agnew case, the proximal level of the speaker’s action does not stop at her ostensive gesture: she also utters a sentence. The intentions attached to the linguistic part of her proximal act have so far been overlooked. One could then tentatively accept that ostensive intentions do not pick out Agnew’s picture on their own, and hope that linguistic proximal intentions get us over the line. Considering linguistic proximal intentions also seems necessary beyond the Carnap-Agnew case. Sometimes the utterance of a demonstrative is not accompanied by an ostensive gesture. If PI intentionalism is to have any chance of accounting for the reference of demonstratives in such cases, one had better look at linguistic proximal intentions.

3.2 Linguistic proximal intentions

How should one conceive of linguistic proximal intentions? I will take an instrumentalist approach to this question: first set out what one would like these intentions to do for PI intentionalism, and then characterise these intentions so that they can do the expected job. Consider the Carnap-Agnew case once again. If one grants that the speaker’s ostensive intention is indeterminate, one may regard its contribution to reference-determination as a mere restriction to objects that are on the wall behind the speaker. One would then like the content of the speaker’s linguistic intention to include e.g. the property of being a picture, so that the overall proximal intention restricts reference to an object that is (i) on the wall behind her and (ii) a picture. This would be enough for PI intentionalism to secure the right prediction about reference. In general, linguistic proximal intentions should be about properties possessed by the intuitive referent of the demonstrative.

To deserve their label, linguistic intentions must be traceable to the linguistic part of the speaker’s intentional action, that is, to her intentional utterance of linguistic expressions. And since linguistic intentions should target properties of the intuitive referent, they should be e.g. intentions to refer to an F rather than intentions to utter the expression “F”. In the Carnap-Agnew case, the speaker utters the words ‘is a picture’ intending them to have their conventional meaning in English. Here I assume for the sake of argument that this yields an intention to refer to a picture. The speaker intends to refer to a picture, and she intends to refer to something on the wall behind her. Her overall proximal intention determines Agnew’s picture, or so the hope goes.

Unfortunately, this apparatus leads to a renewed double threat of misdescription and arbitrariness. In the Carnap-Agnew case, the speaker utters: ‘That’s a picture of one of the greatest philosophers of the twentieth century’. For the same reason that she intends to refer to a picture, the speaker intends to refer to a picture of one of the greatest philosophers of the twentieth century. But Agnew’s picture does not instantiate the latter property, and so Agnew’s picture is not predicted to be the referent. Of course, one could pick and choose which part of the speaker’s predication enters her linguistic intention (i.e. only ‘picture’ matters), but this would be arbitrary.

The introduction of reference-determining intentions attached to acts of predication has a further damning consequence. If uttering ‘This/That is an F’ comes with an intention to refer to an F, and if this intention determines the reference of the demonstrative in subject position, then it seems impossible to say something false of an object by uttering a sentence of the form ‘This/That is F’. The possibility of saying something false of an object with a sentence containing a demonstrative in subject position depends on the possibility that the demonstrative has a referent which does not satisfy the predicate. But this possibility vanishes if the reference of the demonstrative is determined by the predicate. This consequence is unacceptable. Linguistic intentions associated with the predicative part of a sentence whose subject is a demonstrative do not contribute to the determination of reference.

Why did the contrary ever seem plausible? Well, hearers often use the predicative part of ‘This/That is F’ to determine the reference of the demonstrative, and speakers expect them to do so. The relevant sense of ‘determine the reference’ in the previous sentence is something like ascertain: one could call it the interpretive sense of ‘determine’. The interpretive sense of ‘determine reference’ is distinct from its metaphysical sense, which concerns the facts in virtue of which a demonstrative refers. Only the metaphysical sense of ‘determine reference’ is relevant when it comes to providing a metasemantics for demonstratives. Some authors think that a confusion between the two senses of ‘determine reference’ besets a wide range of metasemantics for demonstratives: the facts that speakers use to determine reference in the interpretive sense are mistaken for the facts that determine reference in the metaphysical sense (Bach, 2001; Neale & Schiffer, 2020). This confusion might explain the misguided attempt to elevate predicative linguistic intentions to determinants of reference.

If predicative linguistic intentions really have no reference-determining power, then PI intentionalism must rely on ostensive intentions to make the right prediction about reference in the Carnap-Agnew case. I argued earlier that ostensive intentions cannot be trusted to do the job. Beyond Carnap-Agnew-type cases, it is hard to see how PI intentionalism can account for cases in which a speaker utters ‘That is F’ without making any ostensive gesture and her demonstrative intuitively refers to a certain object. There is no ostensive act, and thus no ostensive intention. As for the speaker’s linguistic act, the intention associated with uttering the predicate ‘… is F’ has no reference-determining power. There are then no proximal intentions left to determine reference.

3.3 No way out for PI intentionalism

The version of PI intentionalism I have attacked thus far is anchored to the general doctrine of structured intentions. We have just seen that this version cannot secure some predictions about reference. Can this problem be overcome by untying PI intentionalism from the doctrine of structured intentions?

Most PI intentionalists do not tie their view to a general theory of intentional action (Bach, 1992; Perry, 2009; Reimer, 1992). However, they offer more than an ad hoc response to the Carnap-Agnew case: they do not just pick a speaker-intention that happens to denote Agnew’s picture and call it proximal. These PI intentionalists use instead the prima facie acceptability of a characterisation of structured intentions as a guide to proximal (and distal) intentions. This practice constrains the postulation of proximal intentions to some extent. Let me illustrate this point. In the Carnap-Agnew case, it seems true that the speaker intends to refer to Carnap’s picture by referring to the picture behind her. And it seems false that she intends to refer to Carnap’s picture by referring to e.g. her ten-year anniversary present. Hence, the intention to refer to the picture behind her is a good candidate for proximality, while the intention to refer to her ten-year anniversary present is not.

Unfortunately for this version of PI intentionalism, the prima facie acceptability of a characterisation of structured intentions is not discriminating enough a criterion to avoid the double threat of misdescription and arbitrariness. Say that the speaker believes that Carnap’s picture is a painting. And say that Agnew’s picture is in fact a photograph. The following characterisation of the speaker's structured intention seems acceptable: the speaker intends to refer to Carnap’s picture by referring to the painting behind her. But the proximal intention yielded by this characterisation does not target Agnew’s picture. So the threat of misdescription remains. Now, the following characterisation seems equally acceptable: the speaker intends to refer to Carnap’s picture by referring to the picture behind her. And the proximal intention yielded by this characterisation targets Agnew’s picture. But since the two characterisations are equally acceptable, it would be arbitrary to take the latter but not the former as a guide to the speaker’s proximal intention. Arbitrariness lurks again. I conclude that untying PI intentionalism from the general doctrine of structured intentions cannot save PI intentionalism from predictive inadequacy.

4 What metasemantics for demonstratives?

PI intentionalism joins a growing list of failed metasemantics for demonstratives. This list includes a version of intentionalism we might call de re thought intentionalism, according to which a demonstrative refers to o only if o is the object of de re thought the speaker intends to communicate. De re thought intentionalism is predictively inadequate, since it falls prey to Carnap-Agnew-type cases. The same goes for the non-intentionalist yet closely related view that the reference of a demonstrative is the object of de re thought the speaker expresses (Devitt, 2022).Footnote 18

The list of failed metasemantics also includes a family of theories according to which the contextual cues available to the audience determine the reference of demonstratives. Different views may be taken about the range of reference-determining contextual cues. According to a restrictive conception of these cues, only ostensive gestures qualify (McGinn, 1981). This theory faces two problems mentioned in Sect. 3.1: ostensive gestures are indeterminate, and there are non-ostensive cases of demonstrative reference. According to a less restrictive view, reference-determining cues include the sentence uttered by the speaker in addition to ostensive gestures. Reference-determining cues then coincide with the speaker’s proximal act. On the face of it, a proximal act metasemantics cannot do more than a proximal intention metasemantics can do, and I have argued that the latter fails. Finally, according to a liberal conception of reference-determining contextual cues, these may include any fact that an ideal (e.g. competent and attentive) interpreter would use to ascertain reference (Wettstein, 1984). This view may be equivalent to the view that the referent of a demonstrative is the salient (or most salient) object in the context of utterance. Heck (2014, pp. 336–343) has argued at length against this kind of view, and to my mind has done so decisively.Footnote 19

Inspired by this liberal contextual cues metasemantics and King’s recent work (King, 2013, 2014), one may propose that the referent of a demonstrative is just the object that an ideal interpreter would take to be intended by the speaker. However, Speaks (2016) and Nowak and Michaelson (2021) have argued that metasemantics of this sort fail because no single characterisation of the ideal interpreter yields correct predictions about reference in every case.

Where to next? One option is to keep looking for another, better metasemantics, without questioning the assumptions brought into metasemantic theorising. Another option is to identify these assumptions, investigate whether some of them can be revised, and see whether these revisions free up logical space for old and new metasemantics. Let me make explicit two desiderata that have implicitly guided us in this article:

  1. A metasemantics for demonstratives must make predictions about reference that match pre-theoretical say-judgements on cases (e.g. the judgement that the speaker said something about object o).

  2. A metasemantics for demonstratives must provide individually necessary and jointly sufficient conditions.

Old or new metasemantics may be pursued, depending on which desideratum is rejected.Footnote 20 Each desideratum, and its corresponding rejection, should be independently assessed. This is not a task for today. However, the lesson I want to draw from the failure of yet another metasemantics in the form of PI intentionalism is that we should at least consider revising the assumptions that have accompanied most metasemantic theorising so far.Footnote 21