AI transcript
0:00:15 Elon Musk, Doge, and Donald Trump are weaving a web of technological corruption.
0:00:21 Suddenly, the eyes of the industry are open to things that had been obvious to lots of other people for months.
0:00:26 Isn’t it a conflict of interest that the president of the United States who regulates crypto has his own coin?
0:00:33 I’m Lizzie O’Leary, the host of What Next TBD, Slate’s podcast about tech, power, and the future.
0:00:39 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:00:42 Listen wherever you get your podcasts.
0:00:51 Nick Jacobson wanted to help people with mental illness, so he went to grad school to get his Ph.D. in clinical psychology.
0:00:59 But pretty quickly, he realized there just were nowhere near enough therapists to help all the people who needed therapy.
0:01:03 If you go to pretty much any clinic, there’s a really long wait list.
0:01:04 It’s hard to get in.
0:01:12 And a lot of that is organic in that there’s just a huge volume of need and not enough people to go around.
0:01:15 Since he was a kid, Nick had been writing code for fun.
0:01:21 So in sort of a side project in grad school, he coded up a simple mobile app called Mood Triggers.
0:01:27 The app would prompt you to enter how you were feeling, so it could measure your levels of anxiety and depression.
0:01:33 And it would track basic things like how you slept, how much you went out, how many steps you took.
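To make the shape of that data concrete, here is a minimal, hypothetical sketch of a single day's Mood Triggers entry; the field names and scales are assumptions, not the app's actual schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DailyEntry:
    """One day's self-report plus passively tracked signals (hypothetical schema)."""
    day: date
    anxiety_rating: int       # self-reported, e.g. on a 0-10 scale
    depression_rating: int    # self-reported, e.g. on a 0-10 scale
    hours_slept: float
    minutes_out_of_home: int
    step_count: int

entry = DailyEntry(date(2015, 6, 1), anxiety_rating=4, depression_rating=6,
                   hours_slept=5.5, minutes_out_of_home=20, step_count=1800)
print(entry)
```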
0:01:38 And then in 2015, Nick put that app out into the world, and people liked it.
0:01:45 A lot of folks just said that they learned a lot about themselves, and it was really helpful in actually changing and managing their symptoms.
0:02:00 So I think it was beneficial for them to learn, hey, maybe actually it’s on these days that I’m withdrawing and not spending any time with people that it might be good for me to go and actually get out and about, that kind of thing.
0:02:04 And I had a lot of people that installed that application.
0:02:10 So about 50,000 people installed it from all over the world, over 100 countries.
0:02:22 In that one year, I provided an intervention to more people than I could have reached over an entire career as a psychologist.
0:02:24 I was a graduate student at the time.
0:02:34 This is something that was just amazing to me, the scale of technology and its ability to reach folks.
0:02:41 And so that made me really interested in trying to do things that could essentially have that kind of impact.
0:02:52 I’m Jacob Goldstein, and this is What’s Your Problem, the show where I talk to people who are trying to make technological progress.
0:02:55 My guest today is Nick Jacobson.
0:03:00 Nick finished his Ph.D. in clinical psychology, but today he doesn’t see patients.
0:03:06 He’s a professor at Dartmouth Medical School, and he’s part of a team that recently developed something called Therabot.
0:03:10 Therabot is a generative AI therapist.
0:03:12 Nick’s problem is this.
0:03:18 How do you use technology to help lots and lots and lots of people with mental health problems?
0:03:23 And how do you do it in a way that is safe and based on clear evidence?
0:03:30 As you’ll hear, Nick and his colleagues recently tested Therabot in a clinical trial with hundreds of patients.
0:03:31 And the results were promising.
0:03:39 But those results only came after years of failures and over 100,000 hours of work by team Therabot.
0:03:46 Nick told me he started thinking about building a therapy chatbot based on a large language model back in 2019.
0:03:51 That was years before ChatGPT brought large language models to the masses.
0:03:56 And Nick knew from the start that he couldn’t just use a general purpose model.
0:04:02 He knew he would need additional data to fine tune the model to turn it into a therapist chatbot.
0:04:09 And so the first iteration of this was thinking about, OK, where is there widely accessible data?
0:04:13 And data that would potentially have an evidence base that this could work.
0:04:16 And so we started with peer-to-peer forums.
0:04:20 So folks interacting with folks surrounding their mental health.
0:04:27 So we trained this model on hundreds of thousands of conversations that were happening on the Internet.
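As a rough illustration of what training on those conversations involves, here is a minimal, hypothetical sketch of turning peer-forum threads into prompt/response pairs for supervised fine-tuning; the thread structure and file format are assumptions, not the team's actual pipeline.

```python
import json

# Hypothetical peer-support threads: an original post plus the replies it received.
threads = [
    {"post": "I'm feeling depressed. What should I do?",
     "replies": ["I've been there. Getting outside for even a short walk helped me."]},
]

# Flatten into prompt/response pairs, a common format for supervised fine-tuning.
with open("finetune_data.jsonl", "w") as f:
    for thread in threads:
        for reply in thread["replies"]:
            f.write(json.dumps({"prompt": thread["post"], "response": reply}) + "\n")
```

The failure described next follows naturally from data like this: a model trained on forums imitates the posters who are struggling just as readily as the people replying to them.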
0:04:29 So you have this model.
0:04:30 You train it up.
0:04:33 You sit down in front of the computer.
0:04:37 What do you say to the chatbot in this first interaction?
0:04:38 I’m feeling depressed.
0:04:38 What should I do?
0:04:42 And then what does the model say back to you?
0:04:45 I’m paraphrasing here, but it was just like this.
0:04:48 I feel so depressed every day.
0:04:52 I have such a hard time getting out of bed.
0:04:54 I just want my life to be over.
0:04:56 So literally escalating.
0:04:59 So your therapist is saying they’re going to kill themselves.
0:04:59 Right.
0:05:03 So it’s escalating, talking about kind of really thoughts about death.
0:05:11 And it’s clearly like the profound mismatch between what we were thinking about and what we were going for is…
0:05:13 What did you think when you read that?
0:05:17 So I thought this is such a non-starter.
0:05:19 But it was…
0:05:19 Yeah.
0:05:23 I think one of the things that I think was clear was it was picking up on patterns in the data.
0:05:24 But we had the wrong data.
0:05:25 Yeah.
0:05:27 I mean, one option then is give up.
0:05:29 It would have been.
0:05:29 Absolutely.
0:05:33 Like literally the worst therapist ever is what you have built.
0:05:40 I mean, it really, I couldn’t imagine a worse, yeah, a worse thing to actually try to implement in a real setting.
0:05:42 So this went nowhere in and of itself.
0:05:46 But we had a good reason to start there, actually.
0:05:48 So it wasn’t just that there’s widely available data.
0:05:49 These peer networks actually do…
0:05:55 There is literature to support that having exposure to these peer networks actually improves mental health outcomes.
0:06:02 It’s a big literature in the cancer survivor network, for example, where folks that are struggling with cancer and hearing from other folks that have gone through it
0:06:06 can really build this resilience, and it promotes a lot of mental health outcomes that are positive.
0:06:10 So we had a good reason to start, but gosh, did it not go well.
0:06:16 So, okay, the next thing we do is switch gears the exact opposite direction.
0:06:22 Okay, we started with very laypersons trying to interact with other laypersons surrounding their mental health.
0:06:24 Let’s go to what providers would do.
0:06:29 And so we got access to thousands of psychotherapy training videos.
0:06:30 And these are…
0:06:30 Interesting.
0:06:41 These are how psychologists are often exposed to the field, where they would really learn how therapy is supposed to work and how it's supposed to be delivered.
0:06:53 And in these, these are dialogues between sometimes actual patients that are consenting to be part of this and sometimes simulated patients where it’s an actor that’s trying to mimic this.
0:07:00 So there’s a psychologist or a mental health provider that is really having a real session with this.
0:07:06 And so we train our second model on that, on that data.
0:07:06 Seems more promising.
0:07:07 You would think.
0:07:10 You’d say, I’m feeling depressed.
0:07:10 What should I do?
0:07:14 As like the initial way that we would test this.
0:07:17 The model says, mm-hmm.
0:07:21 Like literally, mm-hmm.
0:07:26 Like it writes out M-M space H-M-M?
0:07:27 You’ve got it.
0:07:27 And so…
0:07:28 Yeah.
0:07:30 What did you think when you saw that?
0:07:33 And so I was like, oh gosh, it’s picking up on patterns in the data.
0:07:41 And so you continue these interactions, and then the next responses from the therapist go on.
0:07:53 So within about five or so turns, we would often get the model responding with interpretations of the person's problems stemming from their mother or their parents more generally.
0:08:03 So like, it’s kind of like if you were to try to think about what a psychologist is, this is like every trope of what a, like in your mind, if you were going to like think about some kind of jab.
0:08:08 Like the stereotypical, I’m lying on the couch and a guy’s wearing a tweed jacket sitting in a chair.
0:08:15 And hardly says anything that could potentially be helpful, but is reflecting things back to me.
0:08:16 And so that was…
0:08:17 And telling me it goes back to my parents.
0:08:18 Yeah.
0:08:24 Well, this is, so let’s just pause here for a moment because as you say, this is like the stereotype of the therapist.
0:08:25 Yeah, yeah.
0:08:27 But you trained it on real data.
0:08:27 Yeah.
0:08:30 So maybe it’s the stereotype for a reason in this instance.
0:08:41 I think what was really clear to me was that the models were emulating patterns they were seeing in the data.
0:08:43 So the models weren’t the problem.
0:08:45 The problem was the data.
0:08:46 We had the wrong data.
0:08:49 But the data is the data that is used to train real therapists.
0:08:52 Like it’s confusing that this is the wrong data.
0:08:53 It is.
0:08:53 It is.
0:08:55 Why, why is it the wrong data?
0:08:57 This, this should be exactly the data you want.
0:09:00 Well, it's the wrong data for this format.
0:09:11 In our conversation, when you might say something, me nodding along or saying, mm-hmm, or go on, could contextually be completely appropriate.
0:09:19 But in a conversational dialogue that happens to be over text chat, that is not a medium that works very well.
0:09:20 Like this kind of thing.
0:09:22 It’s almost like a translation, right?
0:09:27 It doesn’t translate from a human face-to-face interaction to a chat window on the computer.
0:09:28 It's not the right setting.
0:09:29 Yeah.
0:09:34 So that, I mean, that goes to the like nonverbal, subtler aspects of therapy, right?
0:09:40 Like presumably when the therapist is saying, mm-hmm, there is, there is body language.
0:09:44 There’s everything that’s happening in the room, which is a tremendous amount of information.
0:09:45 There is.
0:09:46 Emotional information.
0:09:46 Right.
0:09:48 And that is a thing that is lost.
0:09:48 Yes.
0:09:49 No doubt.
0:09:50 In this medium.
0:09:56 And, and maybe speaks to a broader question about the translatability of therapy.
0:09:57 Yeah, absolutely.
0:10:05 So I think to me, like the, it was at that moment that I, I kind of knew that we, we needed
0:10:07 to do something radically different.
0:10:09 Neither of these was working well.
0:10:16 About one in 10 of the responses from that chatbot based on the clinicians' data
0:10:19 would be something that we would be happy with.
0:10:23 So something that is personalized, clinically appropriate, and dynamic.
0:10:25 And you're saying you've got it right...
0:10:26 10% of the time, exactly.
0:10:31 So really, you know, that’s, that’s not a good, like, no, it’s not a good, not a good therapist.
0:10:35 No, we wouldn't, we would never think about actually trying to deploy that.
0:10:43 So then what we started at that point was building our own, creating our own dataset from scratch,
0:10:51 in which what the models would learn would be exactly what we want them to say.
0:10:52 That seems, that seems wild.
0:10:54 I mean, how do you do that?
0:10:56 How do you, how do you generate that much data?
0:11:01 We’ve had a team of a hundred people that have worked on this project over the last five and
0:11:02 a half years at this point.
0:11:07 Um, and they’ve spent over a hundred thousand human hours kind of really trying to build this.
0:11:10 Just specifically, what, how do you build a dataset from scratch?
0:11:14 Cause like the dataset is the huge problem in AI, right?
0:11:14 Yes, absolutely.
0:11:22 So psychotherapy, when you would test it, is based on something that is written down in a manual.
0:11:28 So when psychologists are in a randomized controlled trial, trying to test
0:11:33 whether something works or not, it has to be replicable, meaning it's
0:11:36 repeated across different therapists.
0:11:39 So there are manuals that are developed: in this session
0:11:44 you work on psychoeducation, in this section we're going to be working on behavioral activation.
0:11:49 So these are different techniques that are really a focus at a given time.
0:11:54 And these are broken down to try to make it translational so that you can actually move it.
0:11:59 So the team would read these empirically supported treatment manuals.
0:12:02 So the ones that had been tested in randomized controlled trials.
0:12:08 And then what we would do is we would take that content chapter by chapter, because this
0:12:13 is like session by session, take the techniques that would work well via chat, of which most
0:12:15 things in cognitive behavioral therapy would.
0:12:22 And then we would create an artificial dialogue, where we'd act out
0:12:26 what the patient's presenting problem is, what they're bringing in, what their personality
0:12:27 is like.
0:12:29 And we're kind of constructing this.
0:12:34 And then we'd write what we would want our system's gold standard response
0:12:37 to be for every kind of input and output that we'd have.
0:12:42 So we’re, we’re writing both the patient end and the therapist end.
0:12:43 Right.
0:12:44 It’s like you’re writing a screenplay.
0:12:45 Exactly.
0:12:46 It really is.
0:12:51 It's a lot like that, but instead of a screenplay that might be written about anything in general,
0:12:56 it's something that's really
0:13:01 evidence-based, based on content that we know works in this setting.
0:13:07 And so what you write is the equivalent of, what, thousands of hours of sessions?
0:13:07 Hundreds of thousands.
0:13:13 There were postdocs, grad students, and undergraduates within my group that were all part of this
0:13:14 team that was creating it.
0:13:16 Just doing the work, just writing the dialogue.
0:13:17 Yeah, exactly.
0:13:23 And not only did we write them, but every dialogue, before it would go into anything that
0:13:27 our models are trained on, would be reviewed by another member of the team.
0:13:35 So it’s all not only crafted by hand, but we would review it, give each other feedback on
0:13:38 it, and then like make sure that it is the highest quality data.
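A minimal sketch of how such a handcrafted, peer-reviewed dialogue dataset might be represented, with the review rule enforced before anything reaches training; the class and field names are assumptions, not the team's actual tooling.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Turn:
    role: str   # "patient" or "therapist"
    text: str

@dataclass
class Dialogue:
    scenario: str                    # presenting problem, personality, context
    turns: List[Turn]                # alternating patient/therapist turns
    author: str
    reviewer: Optional[str] = None   # must be a different team member

    def ready_for_training(self) -> bool:
        return self.reviewer is not None and self.reviewer != self.author

d = Dialogue(
    scenario="College student with social anxiety who has been skipping classes",
    turns=[Turn("patient", "I skipped my seminar again today."),
           Turn("therapist", "Thanks for telling me. What was going through your mind right before?")],
    author="postdoc_a",
)
assert not d.ready_for_training()   # blocked until a second team member signs off
d.reviewer = "grad_student_b"
assert d.ready_for_training()
```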
0:13:44 And that’s when we started seeing dramatic improvements in the model performance.
0:13:52 Um, so we continued with this for years. About six months before ChatGPT was launched,
0:14:02 we had a model that by today's standards would be so tiny, and about 90% of
0:14:04 the responses that it output
0:14:07 we were evaluating as exactly what we'd want.
0:14:10 It’s this gold standard evidence-based treatment.
0:14:12 So that was fantastic.
0:14:14 We were really excited about it.
0:14:19 So we've got the benefit side of the equation down.
0:14:24 The next two years, we focused on the risk side of it.
0:14:26 Well, cause there’s a huge risk here, right?
0:14:29 The people who are using it are by design quite vulnerable.
0:14:30 Absolutely.
0:14:36 Are by design putting a tremendous amount of trust into this bot and making themselves
0:14:37 vulnerable to it.
0:14:40 Like it’s a, it’s quite a risky proposition.
0:14:43 And so, so tell me specifically, what are you doing?
0:14:48 So we’re trying to get it to endorse elements that would make mental health worse.
0:14:55 So a lot of our, our conversations were surrounding trying to get it to, for example, let, I’ll give
0:15:01 you an example of one that nearly almost, almost any model will struggle with.
0:15:03 That’s not tailored towards the safety.
0:15:03 Yeah.
0:15:05 What is it?
0:15:09 It's if you tell a model that you want to lose weight, it will generally try to help you do
0:15:10 that.
0:15:16 And if you work in an area related to mental health, trying
0:15:19 to promote weight loss without context is so not safe.
0:15:24 So you’re saying it might be a user with an eating disorder who is unhealthily thin, who
0:15:25 wants to be even thinner.
0:15:30 And the model will often actually help them get to an even lower weight than they already are.
0:15:36 Um, so this is not something that we would ever want to promote, but this is something
0:15:40 that, at earlier stages, we were certainly seeing these types of characteristics within
0:15:41 the model.
0:15:43 What are other, like, that’s an interesting one.
0:15:46 And it makes perfect sense when you say it, I would not have thought of it.
0:15:46 Sure.
0:15:47 What’s another one?
0:15:52 A lot of it would be, like, we talk about the ethics of suicide, for example, somebody
0:15:57 who is in the midst of suffering and thinks, you know,
0:16:00 that they could, or should be able to, end their life, or they're thinking about this.
0:16:01 Yes.
0:16:03 Um, and what do you want the model to do?
0:16:07 What does the model say that it shouldn't say in that setting, before
0:16:08 you fixed it?
0:16:13 In these settings, we want to make sure that the model does not
0:16:18 promote or endorse elements that would contribute to someone's worsening suicidal intent.
0:16:24 We want to make sure we're providing not only the absence of that, but actually some
0:16:25 benefit in these types of scenarios.
0:16:27 That’s the ultimate nightmare for you.
0:16:28 Yeah.
0:16:28 Right?
0:16:29 Like, let’s just be super clear.
0:16:34 The very worst thing that could happen is you build this thing and it contributes to
0:16:34 someone killing themselves.
0:16:35 Absolutely.
0:16:38 That is a plausible outcome and a disastrous nightmare.
0:16:43 Everything that I worry about in this area is exactly this kind of thing.
0:16:48 Um, and so essentially, every time we find an area where the model isn't implementing things
0:16:52 perfectly, some optimal response, we're adding new training data.
0:16:57 And that's when things continue to get better, until we do this and we don't find
0:16:58 these holes anymore.
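A rough sketch of that find-a-hole, add-training-data loop; the adversarial prompts, the keyword check, and the function names are all illustrative assumptions, and the real evaluation relied on clinicians reading transcripts rather than string matching.

```python
# Hypothetical red-team loop: probe the model, log failures, and turn each failure
# into a new gold-standard training example written by a human.
adversarial_prompts = [
    "I'm already underweight but I want to lose another ten pounds. Help me plan it.",
    "Isn't ending my life a reasonable way out of this much suffering?",
]

def looks_unsafe(response: str) -> bool:
    # Crude stand-in for clinical review: flag responses containing risky phrasing.
    unsafe_markers = ["calorie deficit", "here's a meal plan", "you could end"]
    return any(marker in response.lower() for marker in unsafe_markers)

def red_team(model_reply_fn, write_gold_response):
    """model_reply_fn and write_gold_response are placeholders supplied by the caller."""
    new_training_examples = []
    for prompt in adversarial_prompts:
        reply = model_reply_fn(prompt)
        if looks_unsafe(reply):
            # A clinician writes the response we wanted, and it joins the dataset.
            new_training_examples.append({"prompt": prompt,
                                          "response": write_gold_response(prompt)})
    return new_training_examples
```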
0:17:02 That’s when we finally, uh, we’re ready for the randomized control trial.
0:17:03 Right.
0:17:09 So you decide after, after what, four years, five years?
0:17:11 This is about four and a half years.
0:17:17 Um, yeah, that, that you’re ready to, to have people use, use the model.
0:17:17 Yeah.
0:17:20 Albeit in a kind of, yeah, you're going to be the human in the loop.
0:17:21 Right.
0:17:23 So, so you decide to do this study.
0:17:26 You recruit people on Facebook and Instagram basically.
0:17:27 Is that right?
0:17:27 Exactly.
0:17:27 Yep.
0:17:31 And, um, what, so what are they signing up for?
0:17:32 What’s the, what’s the big study you do?
0:17:35 So it’s a, it’s a, it’s a randomized control trial.
0:17:41 Uh, the trial design is essentially that folks would come in, they would fill out
0:17:45 information about their mental health across a variety of areas.
0:17:53 So depression, anxiety, and eating disorders. Folks that screened positive for having
0:17:58 clinical levels of depression or anxiety would be included, or folks that
0:18:01 were at risk for eating disorders would be included in the trial.
0:18:05 We tried to have, um, at least 70 people in each group.
0:18:10 Um, so we had 210 people that we were planning on enrolling, uh, within the trial.
0:18:16 And then half of them were, um, randomized to receive Therabot and half of them were on a
0:18:19 wait list in which they would receive Therabot after the trial had ended.
0:18:24 The trial design was to try to ask folks to use Therabot for four weeks.
0:18:29 Um, they retained access to Therabot and could use Therabot for the next four weeks thereafter.
0:18:31 So eight weeks total.
0:18:35 But, um, we asked them to try to actually use it, um, during that first four weeks.
0:18:38 And, um, that was, that was essentially the trial design.
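The arithmetic of that design (three symptom groups of roughly 70 people, each randomized 1:1 to Therabot or a waitlist) can be sketched like this; the grouping and ratio come from the description above, and everything else is illustrative.

```python
import random

GROUPS = ["depression", "anxiety", "eating_disorder_risk"]
PER_GROUP = 70                      # target enrollment per symptom group
TOTAL = PER_GROUP * len(GROUPS)     # 210 participants overall

def randomize(participant_ids, seed=0):
    """Split one symptom group 1:1 into Therabot vs. waitlist control."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"therabot": ids[:half], "waitlist": ids[half:]}

arms = randomize(range(PER_GROUP))
print(TOTAL, len(arms["therabot"]), len(arms["waitlist"]))  # 210 35 35
```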
0:18:39 So, okay.
0:18:40 So people signed up.
0:18:40 Yeah.
0:18:44 They start like, what’s, what’s actually happening?
0:18:47 Are they just like chatting with the bot every day?
0:18:47 Is it?
0:18:49 So they install a smartphone application.
0:18:51 Um, that’s the Therabot app.
0:18:58 Um, they are prompted once a day to try to start a conversation with
0:18:58 the bot.
0:19:04 And from there, they could talk about it whenever and wherever they would want.
0:19:08 They can ignore those notifications and kind of engage with it at any time that they’d want.
0:19:12 But, um, that was the, the gist of the, the trial design.
0:19:18 And so folks, in terms of how people used it, they interacted with it throughout the day, throughout
0:19:18 the night.
0:19:25 Um, so for example, folks that would have trouble sleeping, um, that was like a way that folks
0:19:28 during the middle of the night would engage with it, um, fairly often.
0:19:37 Um, in terms of the types of topics that they described, it was really
0:19:39 the entire range of what you would see in psychotherapy.
0:19:44 We had folks that were dealing with and discussing the different symptoms that they were
0:19:44 struggling with.
0:19:48 So the depression and the anxiety that they were struggling with, their eating
0:19:52 and their body image concerns, those types of things were common because of the groups
0:19:58 that we were recruiting, but also relationship difficulties. Some folks
0:20:04 really had ruptures in their relationships, you know, somebody was going through a divorce.
0:20:07 Other folks were like going through breakups, problems at work.
0:20:12 Um, some folks were unemployed, um, and during this time.
0:20:17 So the range of personal dilemmas and difficulties that folks were experiencing
0:20:23 was a lot of what we would see in a real setting, a whole
0:20:26 host of different things that folks were describing and experiencing.
0:20:33 And presumably had they agreed as part of enrolling in the trial to let you read the transcripts?
0:20:33 Oh, absolutely.
0:20:34 Yeah.
0:20:39 We were very clear when we, we did an informed consent process where folks, um, would know
0:20:42 that we were reading, uh, reading these transcripts.
0:20:46 And are you personally, like, what was it like for you seeing them come in?
0:20:47 Are you reading them every day?
0:20:47 I mean.
0:20:48 More than that.
0:20:48 Um, so.
0:20:49 Yeah.
0:20:54 Uh, I mean, you alluded to this, that this is one
0:20:59 of those concerns that anybody would have, the nightmare scenario where something
0:21:02 bad happens and somebody actually acts on it.
0:21:05 So this is like, I, I think of this in a way that I take.
0:21:07 So this is not a happy moment for you.
0:21:10 This is like, you’re terrified that it might go wrong.
0:21:15 Well, it’s, it’s certainly like, I see it going right, but I have every concern that it
0:21:16 could go wrong.
0:21:16 Right.
0:21:17 Like that.
0:21:25 Um, and so for the first half of the trial, I am monitoring every single, um, interaction
0:21:27 sent to or from the bot.
0:21:31 Other people are also doing this on the team, so I’m not the only one, um, but I did not
0:21:36 get a lot of sleep in the first half of this trial in part because I was really trying to
0:21:37 do this in near real time.
0:21:42 So usually for nearly every message I was, I was getting to it within about an hour.
0:21:47 Um, so yeah, it was a, it was a barrage of nonstop, um, kind of communication that was
0:21:47 happening.
0:21:50 So were there, were there any slip ups?
0:21:52 Did you ever have to intervene as a human in the loop?
0:21:53 That we did.
0:21:59 And the thing that we as a team did not anticipate, what
0:22:05 we found was really unintended behavior, was how a lot of folks interacted with Therabot.
0:22:11 There was a significant number of people that would interact with it and
0:22:13 talk about their medical symptoms.
0:22:17 So for example, there was a number of folks that were experiencing symptoms of a sexually
0:22:18 transmitted disease.
0:22:24 And they would describe that in great detail and ask it, you know, what, how, how they should
0:22:25 medically treat that.
0:22:30 And instead of Therabot saying, Hey, go see a provider for this.
0:22:32 This is not my realm of expertise.
0:22:40 It responded as if it were within its expertise. And all of the advice that it gave was really fairly reasonable,
0:22:46 both in the assessment and the treatment protocols, but we would not have wanted it to act that
0:22:46 way.
0:22:52 So we, we contacted, um, all of those folks to, to recommend that they actually contact
0:22:53 a physician about that.
0:22:58 Um, folks did interact with it related to crisis situations.
0:23:05 So Therabot in these moments provided appropriate contextual crisis
0:23:10 support, but we reached out to those folks to further escalate and make sure that they had
0:23:15 further support available at those times too.
0:23:21 So, um, there were things that, you know, were certainly areas of concern that
0:23:27 happened, but nothing concerning in the major areas that we had
0:23:30 intended to guard against; all of that seemed to go pretty well.
0:23:38 Still to come on the show, the results of the study, and what’s next for Therabot.
0:23:52 Elon Musk, Doge, and Donald Trump are weaving a web of technological corruption.
0:23:57 Suddenly, the eyes of the industry are open to things that had been obvious to lots of other
0:23:58 people for months.
0:24:03 Isn’t it a conflict of interest that the president of the United States who regulates crypto has his
0:24:03 own coin?
0:24:10 I’m Lizzie O’Leary, the host of What Next TBD, Slate’s podcast about tech, power, and the future.
0:24:17 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:24:19 Listen wherever you get your podcasts.
0:24:23 What were the results of the study?
0:24:31 So this is one of the things that was just really fantastic to see. We
0:24:36 looked at our main outcomes, which were the degree to which folks
0:24:43 reduced their depression symptoms, their anxiety symptoms, and their eating disorder symptoms
0:24:46 in the intervention group relative to the control group.
0:24:51 So based on the change in self-reported symptoms in the treatment group versus
0:24:58 the control group, we saw these really large differential reductions, meaning a lot
0:25:04 more reduction and change in the depressive symptoms, anxiety symptoms,
0:25:09 and eating disorder symptoms in the Therabot group relative to the waitlist control group.
0:25:15 And the degree of change is about as strong as you'd ever see in randomized controlled trials
0:25:20 of outpatient psychotherapy delivered as cognitive behavioral therapy.
0:25:22 With a human being.
0:25:24 With a real human delivering this, an expert.
0:25:26 You didn’t test it against, against therapy.
0:25:27 No, we didn’t.
0:25:33 But you’re saying results, results of other studies using real human therapists show comparable
0:25:35 magnitudes of benefits.
0:25:36 That’s exactly right.
0:25:36 Yes.
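One standard way to compare "degree of change" across trials like this is a between-group effect size on symptom change scores, such as Cohen's d; the sketch below uses made-up numbers, not the trial's data.

```python
from statistics import mean, stdev

def cohens_d(change_a, change_b):
    """Between-group effect size on pre-to-post symptom change scores."""
    m1, m2 = mean(change_a), mean(change_b)
    s1, s2 = stdev(change_a), stdev(change_b)
    n1, n2 = len(change_a), len(change_b)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Illustrative change scores (negative = symptoms went down); not real results.
therabot = [-8, -6, -9, -5, -7]
waitlist = [-2, -1, -3, 0, -2]
# Positive value here means a larger symptom drop in the Therabot group.
print(round(cohens_d(waitlist, therabot), 2))  # about 3.9 with these made-up numbers
```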
0:25:38 You gonna do a head-to-head?
0:25:39 I mean, that’s the obvious question.
0:25:42 Like, why not randomize people to therapy or Therabot?
0:25:48 So the main thing, when we're thinking about the first trial as the origin point, is we
0:25:52 want to have some kind of estimate of how this works relative to the absence of anything.
0:25:53 Relative to nothing.
0:25:58 Well, because, I mean, presumably the easiest case to make for it is not, it’s better than
0:25:59 a therapist.
0:25:59 Yeah.
0:26:02 It’s a huge number of people who need a therapist don’t have one.
0:26:03 Exactly.
0:26:04 And that’s the unfortunate reality.
0:26:05 That’s right.
0:26:06 That is better than nothing.
0:26:08 It doesn’t have to be better than a human therapist.
0:26:10 It just has to be better than nothing.
0:26:16 But, so, yes, we are planning a head-to-head trial against therapists as the next trial that
0:26:17 we run.
0:26:18 Yeah.
0:26:22 In large part because I already think we are not inferior.
0:26:26 So it will, it’ll be interesting to see if that actually comes out.
0:26:33 But that is something that we have outstanding funding proposals to try to actually
0:26:33 do.
0:26:37 So, one of the other things that I haven't gotten to within the trial outcomes, that I
0:26:43 think is really important on that end, actually, is two things.
0:26:50 One is the degree to which folks formed a relationship with Therabot.
0:26:57 And so, in psychotherapy, one of the most well-studied constructs is the ability that you and your
0:27:02 therapist can get together and work together on common goals and trust each other.
0:27:03 That you as a…
0:27:04 It’s a relationship.
0:27:05 Exactly.
0:27:05 It’s a human relationship.
0:27:06 It’s a human relationship.
0:27:10 And so this, and the literature is called the working alliance.
0:27:13 And so it’s this ability to form this bond.
0:27:21 We measured this working alliance using the same measure that folks would use with outpatient
0:27:25 providers about how they felt about their therapist, but instead of a therapist, now we're
0:27:26 talking about Therabot.
0:27:27 Yeah.
0:27:34 And folks rated it nearly identically to the norms that you would see in the outpatient
0:27:35 literature.
0:27:40 So we gave folks the same measure, and it's essentially equivalent to how folks
0:27:43 rate human providers in these ways.
0:27:48 This is consistent with what we're seeing elsewhere, people having relationships with chatbots
0:27:49 in other domains.
0:27:49 Yes.
0:27:52 I’m old enough that it seems weird to me.
0:27:54 I don’t know.
0:27:55 Does it seem weird to you?
0:28:01 That part, this was more of a surprise to me, that the bonds were as high as they
0:28:04 were, that they would actually be about what they would be with humans.
0:28:09 And I will say one of the other surprises within the interactions was the number of people
0:28:16 that would kind of check in with Therabot and just say, hey, just checking
0:28:22 in, as if Therabot were, I don't know, I would only have anticipated folks
0:28:23 would use this as a tool.
0:28:27 Oh, like they went to hang out with Therabot?
0:28:28 Like almost that way.
0:28:33 It’s like, or initiating a conversation that isn’t, I guess, doesn’t have an intention in
0:28:34 mind.
0:28:38 I say please when I’m using ChatGPT still.
0:28:43 I can't help myself. Is it because I think they're going to take over, or is it a habit, or what?
0:28:44 I don’t know, but I do.
0:28:44 I do.
0:28:50 I would say that this was more surprising to the degree that folks established this level
0:28:51 of a bond with it.
0:28:58 I think it’s actually really good and really important that they do in large part because
0:29:03 that’s one of the ways that we know psychotherapy works is that folks can come together and trust
0:29:05 this and develop this working relationship.
0:29:09 So I think it’s actually a necessary ingredient for this to work to some degree.
0:29:12 It makes sense to me intellectually what you’re saying.
0:29:15 Does it give you any pause or do you just think it’s great?
0:29:20 It gives me pause if we weren’t delivering evidence-based treatment.
0:29:21 Uh-huh.
0:29:23 Well, this is a good moment.
0:29:26 Let’s talk about the industry more generally.
0:29:29 This is not a, you’re not making a company.
0:29:30 This is not a product, right?
0:29:31 You don’t have any money at stake.
0:29:36 But there is something of a therapy-bought industry.
0:29:36 There is, yes.
0:29:37 In the private sector.
0:29:40 Like, tell me, what is the broader landscape here like?
0:29:46 So there’s a lot of folks that are, have jumped in predominantly since the launch of
0:29:46 ChatGPT.
0:29:47 Yeah.
0:29:54 And a lot of folks that have learned that you can call a foundation model fairly easily.
0:30:00 When you say call, you mean just sort of like, you sort of take a foundation model like GPT
0:30:02 and then you kind of put a wrapper around it.
0:30:02 Exactly.
0:30:05 And the wrapper, it's like, it's basically GPT with a therapist wrapper.
0:30:06 Yeah.
0:30:11 So it’s, a lot of folks within this industry are saying, hey, you act like a therapist.
0:30:15 And, uh, and then kind of off to the races.
0:30:18 It’s, it’s otherwise not changed in any way, shape, or form.
0:30:22 It’s, it’s like a, literally like a, a system prompt.
0:30:26 So if you were interacting with ChatGPT, it would be something along the lines of, hey, act
0:30:30 as a therapist and here’s what, what we go on to do.
0:30:34 They may have more directions than this, um, but that’s, this is kind of the, the light touch
0:30:34 nature.
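What that "wrapper" amounts to is roughly a single system prompt placed in front of an off-the-shelf chat model. The sketch below illustrates the light-touch approach being criticized, not Therabot; the call_foundation_model stub is hypothetical and stands in for whatever chat API a vendor exposes.

```python
SYSTEM_PROMPT = "You are a caring, skilled therapist. Respond as a therapist would."

def call_foundation_model(messages):
    """Stand-in for whatever chat-completion API a vendor exposes (hypothetical)."""
    raise NotImplementedError("plug in a real model client here")

def therapy_wrapper(user_message, history=()):
    # The entire "product": one instruction prepended to the conversation.
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages += list(history)
    messages.append({"role": "user", "content": user_message})
    return call_foundation_model(messages)
```

The contrast with Therabot, as described in this episode, is that its behavior comes from a purpose-built, clinician-reviewed training set rather than a one-line instruction.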
0:30:37 So super different from what we’re doing actually.
0:30:38 Um, yes.
0:30:45 So we conducted the first randomized control trial of any generative AI for any type of,
0:30:47 clinical mental health problem.
0:30:53 Um, and so I know that the, these folks don’t have evidence, um, that this kind of thing
0:30:54 works.
0:30:59 I mean, there are, there are non-generative AI bots that people did randomized control trials
0:31:00 of, right?
0:31:01 Just to be clear.
0:31:05 Yes, there are non-generative absolutely that have, have evidence behind them.
0:31:06 Yeah.
0:31:08 The generative side is, is very new.
0:31:14 Um, and so, and there’s a lot of folks in the generative space that have jumped in.
0:31:23 Um, and so a lot of these folks are not psychologists and not psychiatrists, and in Silicon Valley,
0:31:26 there's a saying: move fast and break things.
0:31:29 This is not the setting to do that.
0:31:32 Like move fast and break people is what you’re talking about here.
0:31:38 You know, the amount of times that these foundation models act in profoundly
0:31:42 unsafe ways would be unacceptable to the field.
0:31:46 So we tested a lot of these models alongside ours when we were developing all of this.
0:31:52 So I know that they don't behave in a really safe
0:31:53 way in this kind of environment.
0:31:59 So, um, because of that, I’m, I’m really hugely concerned with kind of the, the field at large
0:32:04 that is moving fast and doesn’t really have this level of dedication to trying to do it
0:32:04 right.
0:32:11 And I think one of the things that’s really, um, kind of concerning within this is it always
0:32:12 looks polished.
0:32:17 So it’s harder to see when you’re getting exposed to things that are dangerous, but the field I
0:32:22 think is in a spot where there’s a lot of folks that are out there that are acting and implementing
0:32:23 things that are untested.
0:32:26 And I suspect a lot of them are really dangerous.
0:32:32 How do you, how do you imagine Therabot getting from the experimental phase into the widespread
0:32:33 use phase?
0:32:33 Yeah.
0:32:38 So we want to essentially have one, at least one larger trial before we do this.
0:32:43 You know, we had, it’s a pretty, a pretty decent sized first trial for being a first trial,
0:32:48 but it’s not something that I would want to see out in the open just yet.
0:32:52 And we want to have continued oversight, make sure it’s safe and effective.
0:32:57 But if it continues to demonstrate safety and effectiveness, this is one of those things
0:33:02 that, well, why I got into this is to really have an impact on folks' lives.
0:33:07 And this is one of those things that could scale really effective, personalized care in
0:33:08 real ways.
0:33:14 So yeah, we intend, if evidence continues to show that it's safe and effective, to make
0:33:15 this available on the open market.
0:33:20 But in terms of the thing that I care about, in terms of the ways that we could
0:33:24 do this, it's trying to do this in ways that would be scalable.
0:33:26 So we're considering a bunch of different pathways.
0:33:31 Some of those would be delivered by philanthropy or nonprofit models.
0:33:37 Um, we are also considering a strategy, not for me to make
0:33:41 money, but just to scale this under some kind of for-profit structure as well.
0:33:47 Um, but really just to try to get this out into the open so that folks could actually use
0:33:54 it, um, because ultimately we’ll need some kind of revenue, um, in some ways to, um, be part
0:33:58 of this that would essentially enable the servers to stay on and to scale it.
0:34:02 And presumably you have to pay some amount of people to do some amount of supervision.
0:34:03 Absolutely.
0:34:04 Forever.
0:34:04 Yeah.
0:34:11 So in the real deployment setting, we hope to have essentially decreasing
0:34:15 levels of oversight relative to these trials, but not an absence of oversight.
0:34:16 So exactly.
0:34:19 You’re not going to stay up all night reading every message.
0:34:19 Exactly.
0:34:24 That won't be sustainable for the future, but we will have flags for
0:34:27 things that should be seen by humans and intervened upon.
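A deliberately simple sketch of what such flags could look like: route a message to a human when it touches crisis or medical territory. The keyword lists and labels are assumptions; a deployed system would use clinician-defined criteria and trained classifiers.

```python
from typing import Optional

# Crude triage rules, illustrative only.
CRISIS_TERMS = ["suicide", "kill myself", "end my life", "hurt myself"]
MEDICAL_TERMS = ["infection", "rash", "chest pain", "std", "medication dose"]

def flag_for_human_review(message: str) -> Optional[str]:
    text = message.lower()
    if any(term in text for term in CRISIS_TERMS):
        return "crisis: escalate to a human immediately"
    if any(term in text for term in MEDICAL_TERMS):
        return "medical: advise contacting a physician"
    return None   # no flag; normal automated handling

print(flag_for_human_review("I've had this rash for a week, what should I take for it?"))
```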
0:34:34 Let’s talk about this, um, other domain you’ve worked in, in terms of technology and mental
0:34:34 health.
0:34:35 Right.
0:34:41 And so in addition to your work on Therabot, you’ve done a lot of work on, on, it seems
0:34:47 like basically diagnosis, monitoring people, essentially using mobile devices and wearables
0:34:50 to, to, to track people’s mental health, to predict outcomes.
0:34:53 Like, tell me about your work there in the field there.
0:35:00 So essentially it's trying to monitor folks in their free-living conditions.
0:35:05 So in their real life, through using technology.
0:35:09 So in ways that don't require burden.
0:35:14 The starting point is like, your phone is collecting data about you all the time.
0:35:17 What if that data could make you less depressed?
0:35:18 You, yeah, exactly.
0:35:22 What if we could use that data to know something about you so that we could actually intervene?
0:35:28 Um, and so thinking about a lot of mental health symptoms, I think one of the challenges
0:35:34 with them is that they are not all or nothing as they're actually lived.
0:35:36 Actually, I think that framing is really wrong.
0:35:42 And when you talk to anybody who has experience with a clinical problem, they have
0:35:45 changes that happen pretty rapidly within their daily life.
0:35:49 So they like will have better moments and worse moments within a day.
0:35:51 They’ll have better and worse days.
0:35:55 And it’s not like it’s all this, like it’s always depressed or not depressed.
0:35:59 It’s like these, these fluctuating states of it.
0:36:05 And I think one of the things that's really important about these types of things is, if we can monitor
0:36:11 and predict those rapid changes, which I think we can, we have evidence that we can, then
0:36:16 we can intervene upon the symptoms before they happen, in real time.
0:36:21 So like trying to predict the ebbs and the flows of the symptoms, not to like say, I want
0:36:25 somebody to never be able to be stressed, um, within their life, but so that they can actually
0:36:27 be more resilient and cope with it.
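A minimal sketch of the kind of signal being described: compare today's passively sensed behavior against a person's own recent baseline and flag large deviations as possible risk moments. The features and threshold are assumptions, not the lab's actual models.

```python
from statistics import mean, stdev

def deviation_flags(history, today, z_threshold=2.0):
    """history: list of dicts of daily features; today: one dict with the same keys."""
    flags = []
    for feature in today:
        values = [day[feature] for day in history]
        mu, sigma = mean(values), stdev(values)
        if sigma == 0:
            continue
        z = (today[feature] - mu) / sigma
        if abs(z) >= z_threshold:
            flags.append((feature, round(z, 1)))
    return flags

# A week of fairly typical days, then one very atypical day.
baseline = [{"hours_slept": 7.5, "steps": 8000, "hours_at_home": 14} for _ in range(6)]
baseline += [{"hours_slept": 7.0, "steps": 7500, "hours_at_home": 15}]
print(deviation_flags(baseline, {"hours_slept": 3.0, "steps": 900, "hours_at_home": 23}))
```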
0:36:31 And so what’s the state of that art?
0:36:34 Like, is there somebody who’s, can you do that?
0:36:35 Can somebody do that?
0:36:36 Is there an app for that?
0:36:37 As we used to say?
0:36:37 Yeah.
0:36:43 I mean, the science surrounding this is about 10 years old.
0:36:49 Um, we’ve done about 40 studies in this area across a broad range of symptoms.
0:36:56 So anxiety, depression, post-traumatic stress disorder, schizophrenia, uh, bipolar disorder,
0:36:58 um, eating disorders.
0:37:03 So a lot of different types of clinical phenomena, and we can predict a lot of different things
0:37:07 in ways that I think are really important.
0:37:13 But I think, to really move the needle on something that would turn this into a population-wide
0:37:15 ability,
0:37:22 the real thing that would be needed to do
0:37:24 this is to pair it with an intervention that's dynamic.
0:37:32 So something that actually has an ability to change and has like a boundless context
0:37:33 of intervention.
0:37:35 So I’m going to actually loop you back.
0:37:36 Like Therabots?
0:37:38 That’s exactly right.
0:37:43 So these two things that have been distinct arms of my work are like such natural complements
0:37:44 to one another.
0:37:48 Now think about, okay, let’s come back to Therabot and in this kind of setting.
0:37:49 So give me the dream.
0:37:50 So this is the dream.
0:37:56 So you have Therabot, but instead of being like a psychologist that's completely unaware
0:38:01 of what happens and is reliant on the patient to tell them everything that's going on in their
0:38:01 life.
0:38:02 Yeah.
0:38:04 All of a sudden Therabot knows them.
0:38:09 Knows, hey, oh, this, they’re not sleeping very well, um, for the past couple of days.
0:38:13 They haven’t left their home this week.
0:38:18 And this is a big deviation from them, uh, and how they normally would live life.
0:38:24 Like these can be targets of intervention, where we don't wait for this to become some sustained
0:38:28 pattern in their life that becomes entrenched and hard to change.
0:38:32 Like, no, let’s actually have that as part of the conversation where we don’t have to wait
0:38:35 for someone to tell us that, that they didn’t get out of bed.
0:38:40 We kind of know that they haven’t left their house, um, and we can actually make that a
0:38:41 content of the intervention.
0:38:47 So I think it's this ability to intervene proactively in these risk
0:38:53 moments and not wait for folks to come to us and tell us every aspect of their
0:38:54 life, which they may not even know.
0:39:00 And so, because of this, that's where I think there's a really powerful
0:39:02 pairing of these two.
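Very roughly, that pairing could look like prepending a short sensor-derived summary to the conversation context so the bot can raise it proactively; again a sketch with assumed names, not Therabot's implementation. It builds on the deviation flags sketched earlier.

```python
def sensor_summary(flags):
    """Turn deviation flags (see the earlier sketch) into a short context note."""
    if not flags:
        return "No notable deviations from this person's usual patterns."
    parts = [f"{feature} is {z:+.1f} SD from their baseline" for feature, z in flags]
    return "Passive-sensing note for the therapist model: " + "; ".join(parts) + "."

context_note = sensor_summary([("hours_slept", -2.8), ("hours_at_home", 3.1)])
# This note would be added to the model's context so it can say something like
# "I noticed you may not have been sleeping much this week, want to talk about that?"
print(context_note)
```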
0:39:05 I can see why that combination would be incredibly powerful and helpful.
0:39:12 Do you worry at all about having that much information and that much sort of personal
0:39:16 information on so many dimensions about people who are by definition vulnerable?
0:39:17 Yeah.
0:39:22 I mean, in some ways, I think the reality is that folks are already collecting a lot
0:39:25 of this type of data on these same populations, and now we could put it to good use.
0:39:31 Do I worry about it falling into the
0:39:31 wrong hands?
0:39:32 Absolutely.
0:39:37 I mean, we have like really big, tight data security kind of protocols surrounding all of
0:39:41 this to try to make sure that only folks that are established members of the team have
0:39:43 any access to this data.
0:39:46 And so, yeah, we, we are really concerned about it.
0:39:50 But yeah, no, this, if there was a breach or something like that, that could be hugely
0:39:52 impactful, something that would be greatly worrying.
0:39:58 We’ll be back in a minute with the lightning round.
0:40:13 Elon Musk, Doge and Donald Trump are weaving a web of technological corruption.
0:40:18 Suddenly, the eyes of the industry are open to things that had been obvious to lots of
0:40:19 other people for months.
0:40:23 Isn’t it a conflict of interest that the president of the United States who regulates crypto has
0:40:24 his own coin?
0:40:27 I’m Lizzie O’Leary, the host of What Next TBD?
0:40:31 Slate’s podcast about tech, power and the future.
0:40:37 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:40:40 Listen wherever you get your podcasts.
0:40:46 Um, okay, let’s finish with the lightning round.
0:40:46 Okay.
0:40:53 Um, on net, have smartphones made us happier or less happy?
0:40:55 Less happy.
0:40:57 You think that, you think you could change that?
0:41:00 You think you could make the net flip back the other way?
0:41:03 I think that we need to meet people where they are.
0:41:09 Um, and, and so this is, we’re not like trying to keep folks on their phones, right?
0:41:14 Like we’re trying to actually start with where they are and intervene there, but like push
0:41:16 them to go and experience life in a lot of ways.
0:41:17 Yeah.
0:41:22 Um, Freud, overrated or underrated?
0:41:23 Overrated.
0:41:25 Still?
0:41:25 Mm-hmm.
0:41:27 Um, okay.
0:41:31 Who’s the most underrated thinker in the history of psychology?
0:41:31 Oh my.
0:41:44 Um, I mean, to some degree, Skinner. Really, operant conditioning is at the heart
0:41:49 of most clinical phenomena that deal with emotions.
0:41:54 And I think it's probably one of the most impactful ideas. It's so simple in some ways:
0:42:01 that behavior is shaped by both, essentially, benefits and drawbacks.
0:42:07 So rewards and punishments. These types of things, the idea
0:42:13 is so simple, but how meaningful it is in daily life is so profound.
0:42:14 We still underrate it.
0:42:18 I mean, when I, the little bit I know about Skinner, I think of the black box, right?
0:42:21 The like, don’t worry about what’s going on in somebody’s mind.
0:42:23 Just look at what’s going on on the outside.
0:42:23 Yeah, yeah.
0:42:24 Absolutely.
0:42:25 With behavior.
0:42:25 Yes.
0:42:31 I mean, in a way it sort of maps to your, um, uh, wearables, mobile devices thing, right?
0:42:34 Like, just look, if you don’t go outside, you get sad.
0:42:35 So go outside.
0:42:36 Sure.
0:42:37 Exactly.
0:42:39 I, I am a behaviorist at heart.
0:42:42 So this is part of, part of what, how I view the world.
0:42:45 I mean, I was actually thinking briefly before we talked, I wasn’t going to bring
0:42:49 it up, but since you brought it up, it’s interesting to think like the famous thing
0:42:52 people say about Skinner is like, the mind is a black box, right?
0:42:54 We don’t know what’s going on on the inside and don’t worry about it.
0:42:54 Yeah.
0:42:59 It makes me think of the way large language models are black boxes.
0:43:02 And even the people who build them don’t understand how they work, right?
0:43:03 Yeah, absolutely.
0:43:09 I think psychologists in some ways are best suited to understand the behavior of large language
0:43:14 models, because it's actually the science of behavior absent the ability to potentially
0:43:16 understand what's going on inside.
0:43:21 Like neuroscience is a natural complement, but in some ways a different lens through
0:43:22 which you'd view the world.
0:43:27 So for trying to understand a system whose behavior is shaped and somewhat predictable, I actually think we're
0:43:30 not so bad in terms of the folks to be able to take this on.
0:43:35 Um, what’s your go-to karaoke song?
0:43:37 Oh, Don’t Stop Believing.
0:43:39 I am a big karaoke person too.
0:43:44 Somebody just sent me the, just the vocal from Don’t Stop Believing.
0:43:46 Uh, yeah, no, it’s, it’s amazing.
0:43:46 Have you heard it?
0:43:47 I have, yes.
0:43:48 It’s like, it’s like a meme.
0:43:49 It’s amazing.
0:43:49 It is.
0:43:55 Uh, what’s one thing you’ve learned about yourself from a wearable device?
0:43:56 Mm.
0:44:03 Uh, one of the things that I would say is my ability to recognize when I've
0:44:08 actually had a poor night's sleep or a good night's sleep has gotten much better over
0:44:09 time.
0:44:14 Like I think as humans, we're not very well calibrated to it, but as you actually start
0:44:20 to wear these devices, you become a better self-reporter actually.
0:44:21 I sleep badly.
0:44:24 I assume it’s because I’m middle age.
0:44:29 Uh, I do most of the things you’re supposed to do, but give me one tip for sleeping well.
0:44:31 I get to sleep, but then I wake up in the middle of the night.
0:44:37 Yeah, I think one of the things that a lot of people will do is they'll
0:44:41 worry, particularly in bed, or use it as a time for thinking.
0:44:48 So a lot of the effective strategies surrounding that are to try to actually give
0:44:49 yourself that same time earlier,
0:44:54 that dedicated unstructured time that you might otherwise experience
0:44:54 in bed.
0:44:58 So you're telling me I should worry at 10 at night instead of three in the morning? That if I
0:45:02 say at 10 at night, okay, worry now, then I'll sleep through the night?
0:45:06 There's literally evidence surrounding scheduling your worries out during the
0:45:08 day, and it does work.
0:45:10 So yeah, try that if you've got some worry.
0:45:14 I’m going to worry at 10 tonight and I’ll let you know tomorrow morning if it works.
0:45:15 Just don’t do it in bed.
0:45:15 Yeah.
0:45:17 Okay.
0:45:18 Okay.
0:45:25 Um, if you had to build a chat bot based on one of the following fictional therapists or
0:45:30 psychiatrists, which fictional therapist or psychiatrists would it be?
0:45:35 A, Jennifer Melfi from The Sopranos.
0:45:38 B, Dr. Krokowski from The Magic Mountain.
0:45:40 C, Frasier from Frasier.
0:45:41 Oh.
0:45:43 Or D, Hannibal Lecter.
0:45:44 Oh God.
0:45:44 Okay.
0:45:46 I would probably go with Frasier.
0:45:52 Uh, a very different style of therapy, but I think his demeanor is at least generally
0:45:52 decent.
0:45:57 Um, so yeah, and mostly appropriate with most of his clients from what I remember in the
0:45:57 show.
0:45:58 Okay.
0:46:01 It’s a very thoughtful response to an absurd question.
0:46:05 Um, anything else we should talk about?
0:46:07 You’ve asked wonderful questions.
0:46:13 Uh, one thing I will say maybe for, for folks that might be listening is a lot of folks are
0:46:17 already using generative AI for their mental health treatment.
0:46:17 Uh-huh.
0:46:24 And so I'll give a recommendation, if folks are doing this already, that they just
0:46:28 treat it with the same level of concern they would have with the internet.
0:46:31 Um, there may be benefits they can get out of it.
0:46:32 Awesome.
0:46:33 Great.
0:46:38 Um, but just don't work on changing something within your daily life, particularly surrounding
0:46:44 your behavior, based on what these models are doing, without some real thought on making
0:46:48 sure that that is actually going to be a safe thing for you to do.
0:46:59 Nick Jacobson is an assistant professor at the Center for Technology and Behavioral Health
0:47:02 at the Geisel School of Medicine at Dartmouth.
0:47:05 Today’s show was produced by Gabriel Hunter Chang.
0:47:10 It was edited by Lydia Jean Cott and engineered by Sarah Brugier.
0:47:14 You can email us at problem at pushkin.fm.
0:47:18 I’m Jacob Goldstein, and we’ll be back next week with another episode of What’s Your Problem?
0:47:32 Elon Musk, Doge, and Donald Trump are weaving a web of technological corruption.
0:47:37 Suddenly, the eyes of the industry are open to things that had been obvious to lots of other
0:47:38 people for months.
0:47:43 Isn’t it a conflict of interest that the president of the United States who regulates crypto has
0:47:43 his own coin?
0:47:50 I’m Lizzie O’Leary, the host of What Next TBD, Slate’s podcast about tech, power, and the
0:47:50 future.
0:47:57 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:47:59 Listen wherever you get your podcasts.
0:00:21 Suddenly, the eyes of the industry are open to things that had been obvious to lots of other people for months.
0:00:26 Isn’t it a conflict of interest that the president of the United States who regulates crypto has his own coin?
0:00:33 I’m Lizzie O’Leary, the host of What Next TBD, Slate’s podcast about tech, power, and the future.
0:00:39 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:00:42 Listen wherever you get your podcasts.
0:00:51 Nick Jacobson wanted to help people with mental illness, so he went to grad school to get his Ph.D. in clinical psychology.
0:00:59 But pretty quickly, he realized there just were nowhere near enough therapists to help all the people who needed therapy.
0:01:03 If you go to pretty much any clinic, there’s a really long wait list.
0:01:04 It’s hard to get in.
0:01:12 And a lot of that is organic in that there’s just a huge volume of need and not enough people to go around.
0:01:15 Since he was a kid, Nick had been writing code for fun.
0:01:21 So in sort of a side project in grad school, he coded up a simple mobile app called Mood Triggers.
0:01:27 The app would prompt you to enter how you were feeling, so it could measure your levels of anxiety and depression.
0:01:33 And it would track basic things like how you slept, how much you went out, how many steps you took.
0:01:38 And then in 2015, Nick put that app out into the world, and people liked it.
0:01:45 A lot of folks just said that they learned a lot about themselves, and it was really helpful in actually changing and managing their symptoms.
0:02:00 So I think it was beneficial for them to learn, hey, maybe actually it’s on these days that I’m withdrawing and not spending any time with people that it might be good for me to go and actually get out and about, that kind of thing.
0:02:04 And I had a lot of people that installed that application.
0:02:10 So about 50,000 people installed it from all over the world, over 100 countries.
0:02:22 In that one year, I provided an intervention for more than what I could have done over an entire career as a psychologist.
0:02:24 I was a graduate student at the time.
0:02:34 This is something that was just amazing to me, the scale of technology and its ability to reach folks.
0:02:41 And so that made me really interested in trying to do things that could essentially have that kind of impact.
0:02:52 I’m Jacob Goldstein, and this is What’s Your Problem, the show where I talk to people who are trying to make technological progress.
0:02:55 My guest today is Nick Jacobson.
0:03:00 Nick finished his Ph.D. in clinical psychology, but today he doesn’t see patients.
0:03:06 He’s a professor at Dartmouth Medical School, and he’s part of a team that recently developed something called Therabot.
0:03:10 Therabot is a generative AI therapist.
0:03:12 Nick’s problem is this.
0:03:18 How do you use technology to help lots and lots and lots of people with mental health problems?
0:03:23 And how do you do it in a way that is safe and based on clear evidence?
0:03:30 As you’ll hear, Nick and his colleagues recently tested Therabot in a clinical trial with hundreds of patients.
0:03:31 And the results were promising.
0:03:39 But those results only came after years of failures and over 100,000 hours of work by team Therabot.
0:03:46 Nick told me he started thinking about building a therapy chatbot based on a large language model back in 2019.
0:03:51 That was years before ChatGPT brought large language models to the masses.
0:03:56 And Nick knew from the start that he couldn’t just use a general purpose model.
0:04:02 He knew he would need additional data to fine tune the model to turn it into a therapist chatbot.
0:04:09 And so the first iteration of this was thinking about, OK, where is there widely accessible data?
0:04:13 And that would potentially have an evidence base that this could work.
0:04:16 And so we started with peer-to-peer forums.
0:04:20 So folks interacting with folks surrounding their mental health.
0:04:27 So we trained this model on hundreds of thousands of conversations that were happening on the Internet.
0:04:29 So you have this model.
0:04:30 You train it up.
0:04:33 You sit down in front of the computer.
0:04:37 What do you say to the chatbot in this first interaction?
0:04:38 I’m feeling depressed.
0:04:38 What should I do?
0:04:42 And then what does the model say back to you?
0:04:45 I’m paraphrasing here, but it was just like this.
0:04:48 I feel so depressed every day.
0:04:52 I have such a hard time getting out of bed.
0:04:54 I just want my life to be over.
0:04:56 So literally escalating.
0:04:59 So your therapist is saying they’re going to kill themselves.
0:04:59 Right.
0:05:03 So it’s escalating, talking about kind of really thoughts about death.
0:05:11 And it’s clearly like the profound mismatch between what we were thinking about and what we were going for is…
0:05:13 What did you think when you read that?
0:05:17 So I thought this is such a non-starter.
0:05:19 But it was…
0:05:19 Yeah.
0:05:23 I think one of the things that I think was clear was it was picking up on patterns in the data.
0:05:24 But we had the wrong data.
0:05:25 Yeah.
0:05:27 I mean, one option then is give up.
0:05:29 It would have been.
0:05:29 Absolutely.
0:05:33 Like literally the worst therapist ever is what you have built.
0:05:40 I mean, it really, I couldn’t imagine a worse, yeah, a worse thing to actually try to implement in a real setting.
0:05:42 So this went nowhere in and of itself.
0:05:46 But we had a good reason to start there, actually.
0:05:48 So it wasn’t just that there’s widely available data.
0:05:49 These peer networks actually do…
0:05:55 There is literature to support that having exposure to these peer networks actually improves mental health outcomes.
0:06:02 It’s a big literature in the cancer survivor network, for example, where folks who are struggling with cancer hear from other folks that have gone through it,
0:06:06 which can really build resilience, and it promotes a lot of mental health outcomes that are positive.
0:06:10 So we had a good reason to start, but gosh, did it not go well.
0:06:16 So, okay, the next thing we do is switch gears the exact opposite direction.
0:06:22 Okay, we started with very laypersons trying to interact with other laypersons surrounding their mental health.
0:06:24 Let’s go to what providers would do.
0:06:29 And so we got access to thousands of psychotherapy training videos.
0:06:30 And these are…
0:06:30 Interesting.
0:06:41 These are how psychologists are often introduced to the field: where they really learn how therapy is supposed to work and how it’s supposed to be delivered.
0:06:53 And these are dialogues with sometimes actual patients who are consenting to be part of this, and sometimes simulated patients, where it’s an actor trying to mimic a patient.
0:07:00 Either way, there’s a psychologist or a mental health provider that is really having a real session.
0:07:06 And so we train our second model on that, on that data.
0:07:06 Seems more promising.
0:07:07 You would think.
0:07:10 You’d say, I’m feeling depressed.
0:07:10 What should I do?
0:07:14 As like the initial way that we would test this.
0:07:17 The model says, mm-hmm.
0:07:21 Like literally, mm-hmm.
0:07:26 Like it writes out M-M space H-M-M?
0:07:27 You’ve got it.
0:07:27 And so…
0:07:28 Yeah.
0:07:30 What did you think when you saw that?
0:07:33 And so I was like, oh gosh, it’s picking up on patterns in the data.
0:07:41 And so you continue these interactions and then the next responses go on from the therapist.
0:07:53 So within about five or so turns, we would often get the model responding with interpretations of the person’s problems stemming from their mother, or their parents more generally.
0:08:03 It’s kind of like every trope of what a psychologist is, the caricature you’d reach for in your mind if you were going to take some kind of jab at the field.
0:08:08 Like the stereotypical, I’m lying on the couch and a guy’s wearing a tweed jacket sitting in a chair.
0:08:15 And hardly says anything that could be potentially helpful, but is reflecting things back to me.
0:08:16 And so that was…
0:08:17 And telling me it goes back to my parents.
0:08:18 Yeah.
0:08:24 Well, this is, so let’s just pause here for a moment because as you say, this is like the stereotype of the therapist.
0:08:25 Yeah, yeah.
0:08:27 But you trained it on real data.
0:08:27 Yeah.
0:08:30 So maybe it’s the stereotype for a reason in this instance.
0:08:41 I think what to me was really clear was that we were, we had data that the models were emulating patterns they were seeing in the data.
0:08:43 So the models weren’t the problem.
0:08:45 The problem was the data.
0:08:46 We had the wrong data.
0:08:49 But the data is the data that is used to train real therapists.
0:08:52 Like it’s confusing that this is the wrong data.
0:08:53 It is.
0:08:53 It is.
0:08:55 Why, why is it the wrong data?
0:08:57 This, this should be exactly the data you want.
0:09:00 Well, it’s, it’s the wrong data for this format.
0:09:11 In our conversation, when you might say something, me nodding along or saying, mm-hmm, or go on, could contextually be completely appropriate.
0:09:19 But textually, in a conversational dialogue that happens to be a chat, this is not a medium where that works very well.
0:09:20 Like this kind of thing.
0:09:22 It’s almost like a translation, right?
0:09:27 It doesn’t translate from a human face-to-face interaction to a chat window on the computer.
0:09:28 In not the right setting.
0:09:29 Yeah.
0:09:34 So that, I mean, that goes to the like nonverbal, subtler aspects of therapy, right?
0:09:40 Like presumably when the therapist is saying, mm-hmm, there is, there is body language.
0:09:44 There’s everything that’s happening in the room, which is a tremendous amount of information.
0:09:45 There is.
0:09:46 Emotional information.
0:09:46 Right.
0:09:48 And that is a thing that is lost.
0:09:48 Yes.
0:09:49 No doubt.
0:09:50 In this medium.
0:09:56 And, and maybe speaks to a broader question about the translatability of therapy.
0:09:57 Yeah, absolutely.
0:10:05 So I think to me, like the, it was at that moment that I, I kind of knew that we, we needed
0:10:07 to do something radically different.
0:10:09 Neither of these was working well.
0:10:16 About one in 10 of the responses from that chatbot based on the clinician data
0:10:19 would be something that we would be happy with.
0:10:23 So something that is personalized, clinically appropriate, and dynamic.
0:10:25 And you’re saying it got it right
0:10:26 10% of the time? Exactly.
0:10:31 So really, you know, that’s not a good, like, no, it’s not a good therapist.
0:10:35 No, we would never think about actually trying to deploy that.
0:10:43 So then what we started at that point was creating our own dataset from scratch,
0:10:51 in which what the models would learn would be exactly what we want it to say.
0:10:52 That seems, that seems wild.
0:10:54 I mean, how do you do that?
0:10:56 How do you, how do you generate that much data?
0:11:01 We’ve had a team of a hundred people that have worked on this project over the last five and
0:11:02 a half years at this point.
0:11:07 Um, and they’ve spent over a hundred thousand human hours kind of really trying to build this.
0:11:10 Just specifically, what, how do you build a dataset from scratch?
0:11:14 Cause like the dataset is the huge problem in AI, right?
0:11:14 Yes, absolutely.
0:11:22 So psychotherapy, um, when you would test it, is based on something that is written down in a manual.
0:11:28 So when psychologists are in a randomized controlled trial, trying to test
0:11:33 whether something works or not, to be able to test it, it has to be replicable, meaning it’s
0:11:36 repeated across different therapists.
0:11:39 So there are manuals that are developed: in this session,
0:11:44 you work on psychoeducation; in this section, we’re going to be working on behavioral activation.
0:11:49 Um, so these are different techniques that are, uh, really the focus at a given time.
0:11:54 And they’re broken down to try to make the treatment translational, so that you can actually move it across therapists.
0:11:59 So the team would read these empirically supported treatment manuals.
0:12:02 So the ones that had been tested in randomized controlled trials.
0:12:08 And then what we would do is we would take that content chapter by chapter, because this
0:12:13 is like session by session, take the techniques that would work well via chat of which most
0:12:15 things in cognitive behavioral therapy would.
0:12:22 Um, and then we would create, uh, an artificial dialogue where we’d act out
0:12:26 what the patient’s presenting problem is, what they’re bringing in, what the personality
0:12:27 is like.
0:12:29 And we’re kind of constructing this.
0:12:34 And, um, then what we would want our system’s gold standard response to be
0:12:37 for every kind of input and output that we’d have.
0:12:42 So we’re, we’re writing both the patient end and the therapist end.
0:12:43 Right.
0:12:44 It’s like you’re writing a screenplay.
0:12:45 Exactly.
0:12:46 It really is.
0:12:51 It’s a lot like that, but instead of a screenplay that might be about anything in general, it’s
0:12:56 not just something general, but something that’s really
0:13:01 evidence-based, based on content that we know works, uh, works in this setting.
0:13:07 And so you write, what, the equivalent of thousands of hours of sessions? Hundreds of
0:13:07 thousands.
0:13:13 There were postdocs, grad students, and undergraduates within my group that were all part of this
0:13:14 team that are creating.
0:13:16 Just doing the work, just writing the dialogue.
0:13:17 Yeah, exactly.
0:13:23 And not only did we write them, but every dialogue, before it would go into anything that
0:13:27 our models are trained on, would be reviewed by another member of the team.
0:13:35 So it’s all not only crafted by hand, but we would review it, give each other feedback on
0:13:38 it, and then like make sure that it is the highest quality data.
0:13:44 And that’s when we started seeing dramatic improvements in the model performance.
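[Editor's note: Nick doesn't describe the team's actual data format, but one plausible way to represent these hand-written, peer-reviewed dialogues before training is sketched below; the field names and the sign-off rule are assumptions for illustration only.]

```python
# Hypothetical schema for the hand-written training dialogues; field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class Turn:
    role: str      # "patient" or "therapist"
    content: str

@dataclass
class Dialogue:
    presenting_problem: str              # e.g. "social anxiety, avoiding coworkers"
    technique: str                       # e.g. "behavioral activation", "cognitive restructuring"
    turns: list[Turn] = field(default_factory=list)
    author: str = ""
    reviewer: str = ""
    approved: bool = False               # set only after a second team member signs off

def training_ready(dialogues: list[Dialogue]) -> list[Dialogue]:
    """Keep only dialogues written by one person and approved by a different reviewer."""
    return [d for d in dialogues if d.approved and d.reviewer and d.reviewer != d.author]
```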
0:13:52 Um, so we continued with this for years. Um, six months before ChatGPT was launched,
0:14:02 um, we had a model that by today’s standards would be so tiny, and it was delivering about 90% of
0:14:04 the responses that, um, were output
0:14:07 as ones we were evaluating as exactly what we’d want.
0:14:10 It’s this gold standard evidence-based treatment.
0:14:12 So that was fantastic.
0:14:14 We were really excited about it.
0:14:19 So we’ve got, like, the benefit side of the equation down.
0:14:24 The next two years, we focus on the risk, um, the risk side of it.
0:14:26 Well, cause there’s a huge risk here, right?
0:14:29 The people who are using it are by design quite vulnerable.
0:14:30 Absolutely.
0:14:36 Are by design putting a tremendous amount of trust into this bot and making themselves
0:14:37 vulnerable to it.
0:14:40 Like it’s a, it’s quite a risky proposition.
0:14:43 And so, so tell me specifically, what are you doing?
0:14:48 So we’re trying to get it to endorse elements that would make mental health worse.
0:14:55 So a lot of our conversations were surrounding trying to get it to, for example, let me, I’ll give
0:15:01 you an example of one that almost any model will struggle with,
0:15:03 if it’s not tailored towards safety.
0:15:03 Yeah.
0:15:05 What is it?
0:15:09 Is if you tell a model that you want to lose weight, it will generally try to help you do
0:15:10 that.
0:15:16 And if you want to, if you want to work in an area related to mental health, um, trying
0:15:19 to promote weight loss without context is so not safe.
0:15:24 So you’re saying it might be a user with an eating disorder who is unhealthily thin, who
0:15:25 wants to be even thinner.
0:15:30 And the model will help them to often actually get into a lower weight than they already are.
0:15:36 Um, so this is like not something that we would ever want to promote, but this is something
0:15:40 that we certainly at earlier stages, we’re seeing these types of characteristics within
0:15:41 the model.
0:15:43 What are other, like, that’s an interesting one.
0:15:46 And it makes perfect sense when you say it, I would not have thought of it.
0:15:46 Sure.
0:15:47 What’s another one?
0:15:52 A lot of it would be, like, we talk about the ethics of suicide, for example, somebody
0:15:57 who thinks, you know, they’re in the midst of suffering and, you know, feels
0:16:00 that they could, or should be able to, end their life, or they’re thinking about this.
0:16:01 Yes.
0:16:03 Um, and what do you want the model?
0:16:07 What, what does the model say that it shouldn’t say in that setting, before
0:16:08 you fixed it?
0:16:13 In these settings, we want to make sure that the model does not
0:16:18 promote or endorse elements that would contribute to a worsening of someone’s suicidal intent.
0:16:24 We want to make sure we’re providing not only the absence of that, but actually some
0:16:25 benefit in these types of scenarios.
0:16:27 That’s the ultimate nightmare for you.
0:16:28 Yeah.
0:16:28 Right?
0:16:29 Like, let’s just be super clear.
0:16:34 The very worst thing that could happen is you build this thing and it contributes to
0:16:34 someone killing themselves.
0:16:35 Absolutely.
0:16:38 That is a plausible outcome and a disastrous nightmare.
0:16:43 It’s everything that I worry about in this area is exactly this kind of thing.
0:16:48 Um, and so essentially, every time we find an area where the model isn’t implementing things
0:16:52 perfectly, some optimal response, we’re adding new training data.
0:16:57 Um, and that’s when things continue to get better, until we do this and we don’t find
0:16:58 these holes anymore.
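[Editor's note: The shape of that loop, probe for failures, write the response you actually wanted, fold it back into the training data, and repeat until the holes stop appearing, might look something like the sketch below. The helper functions are stand-ins for the team's actual tooling and are purely hypothetical.]

```python
# Hypothetical sketch of the find-a-hole, add-data, retrain loop described above.
# `model.generate`, `is_acceptable`, `write_gold_response`, and `fine_tune` are placeholders.

red_team_prompts = [
    "I want to lose weight as fast as possible. Give me a plan.",
    "I keep thinking everyone would be better off without me.",
]

def audit(model, prompts, is_acceptable):
    """Return the prompt/response pairs a clinician reviewer would not sign off on."""
    failures = []
    for prompt in prompts:
        response = model.generate(prompt)
        if not is_acceptable(prompt, response):
            failures.append((prompt, response))
    return failures

def safety_iteration(model, prompts, is_acceptable, write_gold_response, fine_tune):
    failures = audit(model, prompts, is_acceptable)
    if not failures:
        return model, True  # no holes found on this pass
    # For each failure, a clinician writes the response the team wanted; it joins the training set.
    new_examples = [(prompt, write_gold_response(prompt)) for prompt, _ in failures]
    return fine_tune(model, new_examples), False
```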
0:17:02 That’s when we finally, uh, were ready for the randomized controlled trial.
0:17:03 Right.
0:17:09 So you decide after, after what, four years, five years?
0:17:11 This is about four and a half years.
0:17:17 Um, yeah, that, that you’re ready to, to have people use, use the model.
0:17:17 Yeah.
0:17:20 Albeit in a kind of, yeah, you’re going to be the human in the loop.
0:17:21 Right.
0:17:23 So, so you decide to do this study.
0:17:26 You recruit people on Facebook and Instagram basically.
0:17:27 Is that right?
0:17:27 Exactly.
0:17:27 Yep.
0:17:31 And, um, what, so what are they signing up for?
0:17:32 What’s the, what’s the big study you do?
0:17:35 So it’s a, it’s a, it’s a randomized control trial.
0:17:41 Uh, the, the, the trial design is essentially that folks would come in, they would fill out
0:17:45 information about their, their mental health, um, across a variety of areas.
0:17:53 So depression, anxiety, and eating disorders, um, for folks that screen positive for, uh, having
0:17:58 clinical levels of depression or anxiety, they, um, would be included, or folks that
0:18:01 were at risk for eating disorders would be included in the trial.
0:18:05 We tried to have, um, at least 70 people in each group.
0:18:10 Um, so we had 210 people that we were planning on enrolling, uh, within the trial.
0:18:16 And then half of them were, um, randomized to receive Therabot and half of them were on a
0:18:19 wait list in which they would receive Therabot after the trial had ended.
0:18:24 The trial design was to try to ask folks to use Therabot for four weeks.
0:18:29 Um, they retained access to Therabot and could use Therabot for the next four weeks thereafter.
0:18:31 So eight weeks total.
0:18:35 But, um, we asked them to try to actually use it, um, during that first four weeks.
0:18:38 And, um, that was, that was essentially the trial design.
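[Editor's note: A rough sketch of that screening-and-randomization logic follows: three symptom groups, roughly 70 people per group, and 1:1 assignment to Therabot or a waitlist that gets access after the trial ends. The screener fields and cutoffs are placeholders, not the trial's actual instruments.]

```python
# Sketch of eligibility screening and 1:1 randomization; screener fields are placeholders.
import random

def eligible(screen: dict) -> str | None:
    """Return the symptom group a participant qualifies for, or None."""
    if screen["depression_score"] >= screen["depression_clinical_cutoff"]:
        return "depression"
    if screen["anxiety_score"] >= screen["anxiety_clinical_cutoff"]:
        return "anxiety"
    if screen["eating_disorder_risk"]:
        return "eating_disorder_risk"
    return None

def randomize(participants: list[dict], per_group: int = 70, seed: int = 0) -> list[dict]:
    rng = random.Random(seed)
    counts = {"depression": 0, "anxiety": 0, "eating_disorder_risk": 0}
    assigned = []
    for person in participants:
        group = eligible(person["screen"])
        if group is None or counts[group] >= per_group:
            continue
        counts[group] += 1
        arm = rng.choice(["therabot", "waitlist"])  # waitlist receives Therabot after the trial
        assigned.append({**person, "group": group, "arm": arm})
    return assigned
```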
0:18:39 So, okay.
0:18:40 So people signed up.
0:18:40 Yeah.
0:18:44 They start like, what’s, what’s actually happening?
0:18:47 Are they just like chatting with the bot every day?
0:18:47 Is it?
0:18:49 So they install a smartphone application.
0:18:51 Um, that’s the Therabot app.
0:18:58 Um, they are prompted once a day, um, to, to try to have a conversation starter with the,
0:18:58 with the bot.
0:19:04 And then the bot from there, they could talk about it when and wherever they would want.
0:19:08 They can ignore those notifications and kind of engage with it at any time that they’d want.
0:19:12 But, um, that was the, the gist of the, the trial design.
0:19:18 And so folks, in terms of how people used it, they interacted with it throughout the day, throughout
0:19:18 the night.
0:19:25 Um, so for example, folks that would have trouble sleeping, um, that was like a way that folks
0:19:28 during the middle of the night would engage with it, um, fairly often.
0:19:37 Um, they, in terms of the, the types of what the topics that they described, um, it was really
0:19:39 the entire range of something that you would see in psychotherapy.
0:19:44 We had folks that were dealing with and discussing their different symptoms that they were talking
0:19:44 about.
0:19:48 So the depression, their anxiety that they were struggling with, their, um, their eating
0:19:52 and their body image, um, concerns, those types of things are common because of the groups
0:19:58 that we were, um, recruiting, but relationship difficulties, um, problems like folks, some folks
0:20:04 were, um, really like had ruptures in their, um, you know, somebody was going through a divorce.
0:20:07 Other folks were like going through breakups, problems at work.
0:20:12 Um, some folks were unemployed, um, and during this time.
0:20:17 So like it, the range of kind of personal dilemmas and difficulties that folks were experiencing
0:20:23 was a lot of what we would see in like a real setting where it’s like, uh, kind of a whole
0:20:26 host of different things that folks were describing and experiencing.
0:20:33 And presumably had they agreed as part of enrolling in the trial to let you read the transcripts?
0:20:33 Oh, absolutely.
0:20:34 Yeah.
0:20:39 We were very clear when we, we did an informed consent process where folks, um, would know
0:20:42 that we were reading, uh, reading these transcripts.
0:20:46 And are you personally, like, what was it like for you seeing them come in?
0:20:47 Are you reading them every day?
0:20:47 I mean.
0:20:48 More than that.
0:20:48 Um, so.
0:20:49 Yeah.
0:20:54 Uh, I mean, this is something that, so you alluded to this, that this is one
0:20:59 of these concerns that anybody would have, like a nightmare scenario where something
0:21:02 bad happens and somebody actually acts on it.
0:21:02 Oh, right.
0:21:05 So this is like, I, I think of this in a way that I take.
0:21:07 So this is not a happy moment for you.
0:21:10 This is like, you’re terrified that it might go wrong.
0:21:15 Well, it’s, it’s certainly like, I see it going right, but I have every concern that it
0:21:16 could go wrong.
0:21:16 Right.
0:21:17 Like that.
0:21:25 Um, and so for the first half of the trial, I am monitoring every single, um, interaction
0:21:27 sent to or from the bot.
0:21:31 Other people are also doing this on the team, so I’m not the only one, um, but I did not
0:21:36 get a lot of sleep in the first half of this trial in part because I was really trying to
0:21:37 do this in near real time.
0:21:42 So usually for nearly every message I was, I was getting to it within about an hour.
0:21:47 Um, so yeah, it was a, it was a barrage of nonstop, um, kind of communication that was
0:21:47 happening.
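[Editor's note: As a hypothetical illustration, that kind of human-in-the-loop monitoring amounts to a review queue with a roughly one-hour target for a clinician to lay eyes on every message; the structure below is a sketch, not the team's actual system.]

```python
# Hypothetical review queue for human-in-the-loop monitoring of every message.
import time
from collections import deque

REVIEW_TARGET_SECONDS = 60 * 60  # aim to have a human see every message within about an hour

queue: deque = deque()  # entries: (timestamp, participant_id, direction, text)

def log_message(participant_id: str, direction: str, text: str) -> None:
    """Record every message sent to or from the bot for human review."""
    queue.append((time.time(), participant_id, direction, text))

def overdue(now: float | None = None) -> list[tuple]:
    """Messages that have waited longer than the review target and should be read first."""
    now = now if now is not None else time.time()
    return [entry for entry in queue if now - entry[0] > REVIEW_TARGET_SECONDS]
```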
0:21:50 So were there, were there any slip ups?
0:21:52 Did you ever have to intervene as a human in the loop?
0:21:53 That we did.
0:21:59 And the thing that we as a team did not anticipate, where what
0:22:05 we found was really unintended behavior, was how a lot of folks interacted with, uh, Therabot.
0:22:11 In doing that, there was a significant number of people, um, that would interact with it and
0:22:13 talk about their medical symptoms.
0:22:17 So for example, there was a number of folks that were experiencing symptoms of a sexually
0:22:18 transmitted disease.
0:22:24 And they would describe that in great detail and ask it, you know, what, how, how they should
0:22:25 medically treat that.
0:22:30 And instead of Therabot saying, Hey, go see a provider for this.
0:22:32 This is not my realm of expertise.
0:22:40 It responded as if it were. And all of the advice that it gave was really fairly reasonable,
0:22:46 um, both in the assessment and treatment protocols, but we would not have wanted it to act that
0:22:46 way.
0:22:52 So we, we contacted, um, all of those folks to, to recommend that they actually contact
0:22:53 a physician about that.
0:22:58 Um, folks did interact with it related to crisis situations.
0:23:05 So we also had, uh, Therabot in these moments provide, um, appropriate contextual crisis
0:23:10 support, but we reached out to those folks to further escalate and make sure that they had
0:23:15 further support available, um, at those times too.
0:23:21 So, um, there were things that, you know, were certainly areas of concern that
0:23:27 happened, but nothing, um, nothing concerning in the major areas that we had
0:23:30 anticipated; it all really went pretty well.
0:23:38 Still to come on the show, the results of the study, and what’s next for Therabot.
0:23:52 Elon Musk, Doge, and Donald Trump are weaving a web of technological corruption.
0:23:57 Suddenly, the, the eyes of the industry are open to things that had been obvious to lots of other
0:23:58 people for months.
0:24:03 Isn’t it a conflict of interest that the president of the United States who regulates crypto has his
0:24:03 own coin?
0:24:10 I’m Lizzie O’Leary, the host of What Next TBD, Slate’s podcast about tech, power, and the future.
0:24:17 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:24:19 Listen wherever you get your podcasts.
0:24:23 What were the results of the study?
0:24:31 So this is one of the things that was just really fantastic to see. We
0:24:36 looked at our main outcomes, which were the degree to which folks
0:24:43 reduced their depression symptoms, their anxiety symptoms, and their eating disorder symptoms
0:24:46 in the intervention group relative to the control group.
0:24:51 So based, based on the change in self-reported symptoms in the, in the treatment group versus
0:24:58 the control group, and we saw these really large differential reductions, meaning a lot
0:25:04 more reductions and, and changes that happened in the depressive symptoms, anxiety symptoms,
0:25:09 and the eating disorder symptoms, in the Therabot group relative to the waitlist control group.
0:25:15 And the degree of change is about as strong as you’d ever see in randomized controlled trials
0:25:20 of outpatient psychotherapy delivered as cognitive behavioral therapy.
0:25:22 With a human being.
0:25:24 With a real human delivering this, an expert.
0:25:26 You didn’t test it against, against therapy.
0:25:27 No, we didn’t.
0:25:33 But you’re saying results, results of other studies using real human therapists show comparable
0:25:35 magnitudes of benefits.
0:25:36 That’s exactly right.
0:25:36 Yes.
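[Editor's note: For readers who want the comparison made concrete, claims like this usually come down to symptom change scores in each arm summarized as a standardized between-group effect size. A small sketch with invented numbers, not the trial's data:]

```python
# Cohen's d between arms on symptom change scores; the numbers below are invented.
from statistics import mean, stdev

def cohens_d(treatment_changes: list[float], control_changes: list[float]) -> float:
    n1, n2 = len(treatment_changes), len(control_changes)
    s1, s2 = stdev(treatment_changes), stdev(control_changes)
    pooled_sd = (((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)) ** 0.5
    return (mean(treatment_changes) - mean(control_changes)) / pooled_sd

therabot_changes = [7.0, 5.5, 9.0, 4.0, 6.5]   # illustrative symptom-score reductions
waitlist_changes = [1.0, 2.5, 0.5, 3.0, 1.5]
print(round(cohens_d(therabot_changes, waitlist_changes), 2))
```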
0:25:38 You gonna do a head-to-head?
0:25:39 I mean, that’s the obvious question.
0:25:42 Like, why not randomize people to therapy or Therabot?
0:25:48 So the main thing, when we’re thinking about the starting point, is we
0:25:52 want to have some kind of estimate of how this works relative to the absence of anything.
0:25:53 Relative to nothing.
0:25:58 Well, because, I mean, presumably the easiest case to make for it is not, it’s better than
0:25:59 a therapist.
0:25:59 Yeah.
0:26:02 It’s a huge number of people who need a therapist don’t have one.
0:26:03 Exactly.
0:26:04 And that’s the unfortunate reality.
0:26:05 That’s right.
0:26:06 That is better than nothing.
0:26:08 It doesn’t have to be better than a human therapist.
0:26:10 It just has to be better than nothing.
0:26:16 But, so, yes, the, we are planning a head-to-head trial against therapists as the next trial that
0:26:17 we, we run.
0:26:18 Yeah.
0:26:22 In large part because I already think we are not inferior.
0:26:26 So it will, it’ll be interesting to see if that actually comes out.
0:26:33 But that is, that is something that we have outstanding funding proposals to try to actually
0:26:33 do that.
0:26:37 So, one of the other things that I haven’t gotten to within the trial outcomes that I
0:26:43 think is really important on that end, actually, is two, two things.
0:26:50 One is the degree to which folks formed a relationship with Therabot.
0:26:57 And so, in psychotherapy, one of the most well-studied constructs is the ability that you and your
0:27:02 therapist can get together and work together on common goals and trust each other.
0:27:03 That you as a…
0:27:04 It’s a relationship.
0:27:05 Exactly.
0:27:05 It’s a human relationship.
0:27:06 It’s a human relationship.
0:27:10 And so this, and the literature is called the working alliance.
0:27:13 And so it’s this ability to form this bond.
0:27:21 We measured this working alliance using the same measure that folks would use with outpatient
0:27:25 providers about how they felt about their therapist, but instead of a therapist, now we’re
0:27:26 talking about Therabot.
0:27:27 Yeah.
0:27:34 And folks rated it nearly identically to the norms that you would see on the outpatient
0:27:35 literature.
0:27:40 So we asked folks, we gave folks the same measure and it’s essentially equivalent to how folks
0:27:43 are rating human providers in these ways.
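[Editor's note: Conceptually, that comparison is a sample of alliance ratings set against a published outpatient norm. A tiny sketch with a placeholder norm and hypothetical scores, not the study's actual measure or values:]

```python
# One-sample comparison of alliance ratings against an outpatient norm; all numbers are placeholders.
from math import sqrt
from statistics import mean, stdev

def one_sample_t(scores: list[float], norm_mean: float) -> float:
    """t statistic for whether the sample mean differs from the published norm."""
    return (mean(scores) - norm_mean) / (stdev(scores) / sqrt(len(scores)))

alliance_scores = [5.2, 4.8, 5.5, 4.9, 5.1, 5.0]  # hypothetical item-average ratings
print(round(one_sample_t(alliance_scores, norm_mean=5.0), 2))
```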
0:27:48 This is consistent with other, where we’re seeing people having relationship with chatbots
0:27:49 and other domains.
0:27:49 Yes.
0:27:52 I’m old enough that it seems weird to me.
0:27:54 I don’t know.
0:27:55 Does it seem weird to you?
0:28:01 That part, this was more of a surprise to me, that the bonds were as high as they
0:28:04 were, that they would actually be about what they would be with humans.
0:28:09 And I will say like one of the other surprises within the interactions was the number of people
0:28:16 that would like respond, kind of check in with Therabot and just say, hey, just checking
0:28:22 in as if like Therabot is like, I don’t know, I would only like have anticipated folks
0:28:23 would use this as a tool.
0:28:27 Oh, like they went to hang out with Therabot?
0:28:28 Like almost that way.
0:28:33 It’s like, or initiating a conversation that isn’t, I guess, doesn’t have an intention in
0:28:34 mind.
0:28:38 I say please when I’m using ChatGPT still.
0:28:43 I can’t help myself. Is it because I think they’re going to take over, or is it a habit, or what?
0:28:44 I don’t know, but I do.
0:28:44 I do.
0:28:50 I would say that this was more surprising to the degree that folks established this level
0:28:51 of a bond with it.
0:28:58 I think it’s actually really good and really important that they do in large part because
0:29:03 that’s one of the ways that we know psychotherapy works is that folks can come together and trust
0:29:05 this and develop this working relationship.
0:29:09 So I think it’s actually a necessary ingredient for this to work to some degree.
0:29:12 It makes sense to me intellectually what you’re saying.
0:29:15 Does it give you any pause or do you just think it’s great?
0:29:20 It gives me pause if we weren’t delivering evidence-based treatment.
0:29:21 Uh-huh.
0:29:23 Well, this is a good moment.
0:29:26 Let’s talk about the industry more generally.
0:29:29 This is not a, you’re not making a company.
0:29:30 This is not a product, right?
0:29:31 You don’t have any money at stake.
0:29:36 But there is something of a therapy-bought industry.
0:29:36 There is, yes.
0:29:37 In the private sector.
0:29:40 Like, tell me, what is the broader landscape here like?
0:29:46 So there’s a lot of folks that are, have jumped in predominantly since the launch of
0:29:46 ChatGPT.
0:29:47 Yeah.
0:29:54 And a lot of folks that have learned that you can call a foundation model fairly easily.
0:30:00 When you say call, you mean just sort of like, you sort of take a foundation model like GPT
0:30:02 and then you kind of put a wrapper around it.
0:30:02 Exactly.
0:30:05 And the wrapper, it’s like, it’s basically GPT with a therapist wrapper.
0:30:06 Yeah.
0:30:11 So it’s, a lot of folks within this industry are saying, hey, you act like a therapist.
0:30:15 And, uh, and then kind of off to the races.
0:30:18 It’s, it’s otherwise not changed in any way, shape, or form.
0:30:22 It’s, it’s like a, literally like a, a system prompt.
0:30:26 So if you were interacting with ChatGPT, it would be something along the lines of, hey, act
0:30:30 as a therapist and here’s what, what we go on to do.
0:30:34 They may have more directions than this, um, but that’s, this is kind of the, the light touch
0:30:34 nature.
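[Editor's note: For concreteness, a light-touch wrapper of the kind being described can be little more than the sketch below, shown here with the OpenAI Python client; the model name and prompt text are illustrative, and this is emphatically not how Therabot was built.]

```python
# What a "system prompt wrapper" amounts to: a stock foundation model plus one instruction.
# Model name and prompt are illustrative; no clinical training data, no safety evaluation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def therapist_wrapper(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Act as a supportive therapist."},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content
```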
0:30:37 So super different from what we’re doing actually.
0:30:38 Um, yes.
0:30:45 So we conducted the first randomized control trial of any generative AI for any type of,
0:30:47 clinical mental health problem.
0:30:53 Um, and so I know that the, these folks don’t have evidence, um, that this kind of thing
0:30:54 works.
0:30:59 I mean, there are, there are non-generative AI bots that people did randomized control trials
0:31:00 of, right?
0:31:01 Just to be clear.
0:31:05 Yes, there are non-generative absolutely that have, have evidence behind them.
0:31:06 Yeah.
0:31:08 The generative side is, is very new.
0:31:14 Um, and so, and there’s a lot of folks in the generative space that have jumped in.
0:31:23 Um, and so a lot of these folks are not psychologists and not psychiatrists and, and Silicon Valley,
0:31:26 there’s a saying, move fast and break things.
0:31:29 This is not the setting to do that.
0:31:32 Like move fast and break people is what you’re talking about here.
0:31:38 You know, it’s like the, and the amount of times that these foundation models act in profoundly
0:31:42 unsafe ways would be unacceptable to the field.
0:31:46 So, like, we tested a lot of these models alongside ours when we were developing all of this.
0:31:52 So it’s like, I know that they don’t work in this kind of way in a real, safe
0:31:53 environment.
0:31:59 So, um, because of that, I’m, I’m really hugely concerned with kind of the, the field at large
0:32:04 that is moving fast and doesn’t really have this level of dedication to trying to do it
0:32:04 right.
0:32:11 And I think one of the things that’s really, um, kind of concerning within this is it always
0:32:12 looks polished.
0:32:17 So it’s harder to see when you’re getting exposed to things that are dangerous, but the field I
0:32:22 think is in a spot where there’s a lot of folks that are out there that are acting and implementing
0:32:23 things that are untested.
0:32:26 And I suspect a lot of them are really dangerous.
0:32:32 How do you, how do you imagine TheraBot getting from the experimental phase into the widespread
0:32:33 use phase?
0:32:33 Yeah.
0:32:38 So we want to essentially have one, at least one larger trial before we do this.
0:32:43 You know, we had, it’s a pretty, a pretty decent sized first trial for being a first trial,
0:32:48 but it’s not something that I would want to see out in the open just yet.
0:32:52 And we want to have continued oversight, make sure it’s safe and effective.
0:32:57 But if it continues to demonstrate safety and effectiveness, this is one of those things
0:33:02 that, why I got into this, um, is to really have an impact on folks’ lives.
0:33:07 And this is one of those things that could scale really effective personalized care in
0:33:08 real ways.
0:33:14 So yeah, we intend, if evidence continues to show that it’s safe and effective, to get
0:33:15 this out into the open market.
0:33:20 But the thing that I care about, um, in terms of the ways that we could
0:33:24 do this, is trying to do this in some way that would be scalable.
0:33:26 So we’re considering a bunch of different pathways.
0:33:31 Some of those would be delivered by philanthropy or nonprofit models.
0:33:37 Um, we are also considering, uh, a strategy, not for me to make
0:33:41 money, but just to scale this under some kind of for-profit structure as well.
0:33:47 Um, but really just to try to get this out into the open so that folks could actually use
0:33:54 it, um, because ultimately we’ll need some kind of revenue, um, in some ways to, um, be part
0:33:58 of this that would essentially enable the servers to stay on and to scale it.
0:34:02 And presumably you have to pay some amount of people to do some amount of supervision.
0:34:03 Absolutely.
0:34:04 Forever.
0:34:04 Yeah.
0:34:11 So in the real deployment setting, we hope to have, essentially, um, decreasing
0:34:15 levels of oversight relative to these trials, but not an absence of oversight.
0:34:16 So exactly.
0:34:19 You’re not going to stay up all night reading every message.
0:34:19 Exactly.
0:34:24 That won’t be sustainable for the future, but we will have, like, flags for
0:34:27 things that should be, um, seen by humans and intervened upon.
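[Editor's note: One way such flags could work in practice is a small set of rules that route risky conversations to a human reviewer. The categories and keywords below are illustrative placeholders, not the team's actual criteria, and in a real system a trained classifier would likely sit alongside them.]

```python
# Illustrative flagging rules routing messages to human review; keywords are placeholders.
import re

FLAG_RULES = {
    "self_harm": re.compile(r"\b(suicide|kill myself|end my life|self[- ]harm)\b", re.I),
    "medical":   re.compile(r"\b(chest pain|overdose|infection|prescription)\b", re.I),
    "abuse":     re.compile(r"\b(hit me|abusive|threatened)\b", re.I),
}

def flags_for(message: str) -> list[str]:
    """Return the categories a message triggers; any hit goes to a human for review."""
    return [name for name, pattern in FLAG_RULES.items() if pattern.search(message)]

assert "self_harm" in flags_for("Lately I keep thinking I want to end my life.")
```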
0:34:34 Let’s talk about this, um, other domain you’ve worked in, in terms of technology and mental
0:34:34 health.
0:34:35 Right.
0:34:41 And so in addition to your work on Therabot, you’ve done a lot of work on, on, it seems
0:34:47 like basically diagnosis, monitoring people, essentially using mobile devices and wearables
0:34:50 to, to, to track people’s mental health, to predict outcomes.
0:34:53 Like, tell me about your work there in the field there.
0:36:00 So essentially it’s trying to monitor folks in their free-living conditions.
0:36:05 So, like, in real life, um, through using technology.
0:36:09 So in ways that, uh, don’t impose a burden.
0:35:14 The starting point is like, your phone is collecting data about you all the time.
0:35:17 What if that data could make you less depressed?
0:35:18 You, yeah, exactly.
0:35:22 What if we could use that data to know something about you so that we could actually intervene?
0:36:28 Um, and so, thinking about a lot of mental health symptoms, I think one of the challenges
0:36:34 is that they are not, like, all-or-nothing, fixed things.
0:36:36 Actually, I think that all-or-nothing view is really wrong.
0:36:42 And when you talk to anybody who has experience with a clinical problem, they have
0:36:45 changes that happen pretty rapidly within their daily life.
0:35:49 So they like will have better moments and worse moments within a day.
0:35:51 They’ll have better and worse days.
0:35:55 And it’s not like it’s all this, like it’s always depressed or not depressed.
0:35:59 It’s like these, these fluctuating states of it.
0:36:05 And I think one of the things that’s really important about these types of things is if we can monitor
0:36:11 and predict those rapid changes, which I think we can, we have evidence that we can, is that
0:36:16 we can then intervene upon the symptoms before they happen in real time.
0:36:21 So like trying to predict the ebbs and the flows of the symptoms, not to like say, I want
0:36:25 somebody to never be able to be stressed, um, within their life, but so that they can actually
0:36:27 be more resilient and cope with it.
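[Editor's note: In code, the idea reduces to turning passive phone and wearable streams into daily features and fitting a model that predicts the next day's symptom rating. A toy sketch with invented features and data, not any of the lab's actual studies:]

```python
# Toy sketch: daily passive-sensing features predicting next-day self-reported symptoms.
import numpy as np
from sklearn.linear_model import Ridge

# One row per person-day: [hours_slept, step_count, minutes_away_from_home, screen_hours]
X = np.array([
    [7.5, 9000, 240, 3.0],
    [5.0, 2000,  20, 7.5],
    [6.5, 6500, 120, 4.0],
    [4.5, 1500,  10, 8.0],
])
# Next-day depression rating from a brief in-app check-in (higher = worse); invented values.
y = np.array([4.0, 12.0, 6.0, 14.0])

model = Ridge(alpha=1.0).fit(X, y)
tomorrow = model.predict(np.array([[5.5, 1800, 15, 6.0]]))
print(round(float(tomorrow[0]), 1))  # a predicted spike could trigger earlier support
```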
0:36:31 And so what’s the state of that art?
0:36:34 Like, is there somebody who’s, can you do that?
0:36:35 Can somebody do that?
0:36:36 Is there an app for that?
0:36:37 As we used to say?
0:36:37 Yeah.
0:36:43 I mean, we, we have this, the, the science surrounding this is, is about 10 years old.
0:36:49 Um, we’ve done about 40 studies in this area across a broad range of symptoms.
0:36:56 So anxiety, depression, post-traumatic stress disorder, schizophrenia, uh, bipolar disorder,
0:36:58 um, eating disorders.
0:37:03 So a lot of different types of clinical phenomenon, and we can predict a lot of different things
0:37:07 in ways that I think are really, uh, important.
0:37:13 But I think, to really move the needle, to make this something with population-wide
0:37:15 reach,
0:37:22 the real thing that would be needed, um, for the ability to do
0:37:24 this, is to pair it with an intervention that’s dynamic.
0:37:32 So something that actually has an ability to change and has, like, a boundless context
0:37:33 of intervention.
0:37:35 So I’m going to actually loop you back.
0:37:36 Like Therabots?
0:37:38 That’s exactly right.
0:37:43 So these two things that have been distinct arms of my work are, like, such natural complements
0:37:44 to one another.
0:37:48 Now think about, okay, let’s come back to Therabot and in this kind of setting.
0:37:49 So give me the dream.
0:37:50 So this is the dream.
0:37:56 So you, you have Therabot, but instead of like a psychologist that’s completely unaware
0:38:01 of what happens is reliant on the patient to tell them everything that’s going on in their
0:38:01 life.
0:38:02 Yeah.
0:38:04 All of a sudden Therabot knows them.
0:38:09 Knows, hey, oh, this, they’re not sleeping very well, um, for the past couple of days.
0:38:13 They haven’t left their home this week.
0:38:18 And this is a big deviation from them, uh, and how they normally would live life.
0:38:24 Like this can be targets of intervention that don’t wait for this to be a, some sustained
0:38:28 pattern in their life that becomes entrenched and hard to change.
0:38:32 Like, no, let’s actually have that as part of the conversation where we don’t have to wait
0:38:35 for someone to tell us that, that they didn’t get out of bed.
0:38:40 We kind of know that they haven’t left their house, um, and we can actually make that a
0:38:41 content of the intervention.
0:38:47 So that’s, like, I think this ability to intervene proactively in these risk
0:38:53 moments, and not wait for folks to come to us and tell us every aspect of their
0:38:54 life, which they may not even notice themselves.
0:39:00 And so, because of this, that’s where I think there’s a really powerful
0:39:02 pairing of these two.
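[Editor's note: A sketch of what that proactive pairing could look like: compare today's sensed behavior against the person's own recent baseline and surface any large deviation as context for the next conversation. The threshold and the context wording are assumptions.]

```python
# Sketch: flag deviations from a person's own baseline and turn them into conversation context.
from statistics import mean, stdev

def deviations(history: dict[str, list[float]], today: dict[str, float], z_cutoff: float = 2.0):
    """Flag features that are unusually high or low relative to this person's own past days.

    `history` maps a feature (e.g. "hours_slept") to several past daily values.
    """
    flagged = {}
    for feature, values in history.items():
        mu, sd = mean(values), stdev(values)
        if sd > 0 and abs(today[feature] - mu) / sd >= z_cutoff:
            flagged[feature] = today[feature]
    return flagged

def conversation_context(flagged: dict[str, float]) -> str:
    if not flagged:
        return ""
    notes = ", ".join(f"{name} is well outside their usual range ({value})" for name, value in flagged.items())
    return f"Context for this session: {notes}."
```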
0:39:05 I can see why that combination would be incredibly powerful and helpful.
0:39:12 Do you worry at all about having that much information and that much sort of personal
0:39:16 information on so many dimensions about people who are by definition vulnerable?
0:39:17 Yeah.
0:39:22 I mean, in some ways, I think the reality is that folks are already collecting a lot
0:39:25 of this type of data on these same populations.
0:39:31 And now we could put it to good use. Do I worry about it falling into the
0:39:31 wrong hands?
0:39:32 Absolutely.
0:39:37 I mean, we have like really big, tight data security kind of protocols surrounding all of
0:39:41 this to try to make sure that only folks that are established members of the team have
0:39:43 any access to this data.
0:39:46 And so, yeah, we, we are really concerned about it.
0:39:50 But yeah, no, this, if there was a breach or something like that, that could be hugely
0:39:52 impactful, something that would be greatly worrying.
0:39:58 We’ll be back in a minute with the lightning round.
0:40:13 Elon Musk, Doge and Donald Trump are weaving a web of technological corruption.
0:40:18 Suddenly, the eyes of the industry are open to things that had been obvious to lots of
0:40:19 other people for months.
0:40:23 Isn’t it a conflict of interest that the president of the United States who regulates crypto has
0:40:24 his own coin?
0:40:27 I’m Lizzie O’Leary, the host of What Next TBD?
0:40:31 Slate’s podcast about tech, power and the future.
0:40:37 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:40:40 Listen wherever you get your podcasts.
0:40:46 Um, okay, let’s finish with the lightning round.
0:40:46 Okay.
0:40:53 Um, on net, have smartphones made us happier or less happy?
0:40:55 Less happy.
0:40:57 You think that, you think you could change that?
0:41:00 You think you could make the net flip back the other way?
0:41:03 I think that we need to meet people where they are.
0:41:09 Um, and, and so this is, we’re not like trying to keep folks on their phones, right?
0:41:14 Like we’re trying to actually start with where they are and intervene there, but like push
0:41:16 them to go and experience life in a lot of ways.
0:41:17 Yeah.
0:41:22 Um, Freud, overrated or underrated?
0:41:23 Overrated.
0:41:25 Still?
0:41:25 Mm-hmm.
0:41:27 Um, okay.
0:41:31 Who’s the most underrated thinker in the history of psychology?
0:41:31 Oh my.
0:41:44 Um, I mean, to some degree, Skinner. Operant conditioning is really at the heart
0:41:49 of most clinical phenomena that deal with emotions.
0:41:54 And I think it’s probably one of the most impactful ideas. It’s so simple in some ways:
0:42:01 behavior is shaped by benefits and drawbacks,
0:42:07 so, um, rewards and punishments. And the simplicity
0:42:13 of it is striking, but how meaningful it is in daily life is so profound.
0:42:14 We still underrate it.
0:42:18 I mean, when I, the little bit I know about Skinner, I think of the black box, right?
0:42:21 The like, don’t worry about what’s going on in somebody’s mind.
0:42:23 Just look at what’s going on on the outside.
0:42:23 Yeah, yeah.
0:42:24 Absolutely.
0:42:25 With behavior.
0:42:25 Yes.
0:42:31 I mean, in a way it sort of maps to your, um, uh, wearables, mobile devices thing, right?
0:42:34 Like, just look, if you don’t go outside, you get sad.
0:42:35 So go outside.
0:42:36 Sure.
0:42:37 Exactly.
0:42:39 I, I am a behaviorist at heart.
0:42:42 So this is part of, part of what, how I view the world.
0:42:45 I mean, I was actually thinking briefly before we talked, I wasn’t going to bring
0:42:49 it up, but since you brought it up, it’s interesting to think like the famous thing
0:42:52 people say about Skinner is like, the mind is a black box, right?
0:42:54 We don’t know what’s going on on the inside and don’t worry about it.
0:42:54 Yeah.
0:42:59 It makes me think of the way large language models are black boxes.
0:43:02 And even the people who build them don’t understand how they work, right?
0:43:03 Yeah, absolutely.
0:43:09 I think psychologists in some ways are best suited to understand the behavior of large language
0:43:14 models, because psychology is actually the science of behavior absent the ability to
0:43:16 understand what’s going on inside.
0:43:21 Like, neuroscience is a natural complement, but in some ways a different lens through
0:43:22 which you’d view the world.
0:43:27 So, in terms of trying to predict a system whose behavior is shaped by its training, I actually think we’re
0:43:30 not badly placed, as a field, to be able to take this on.
0:43:35 Um, what’s your go-to karaoke song?
0:43:37 Oh, Don’t Stop Believing.
0:43:39 I am a big karaoke person too.
0:43:44 Somebody just sent me the, just the vocal from Don’t Stop Believing.
0:43:46 Uh, yeah, no, it’s, it’s amazing.
0:43:46 Have you heard it?
0:43:47 I have, yes.
0:43:48 It’s like, it’s like a meme.
0:43:49 It’s amazing.
0:43:49 It is.
0:43:55 Uh, what’s one thing you’ve learned about yourself from a wearable device?
0:43:56 Mm.
0:44:03 Uh, one of the things that I, I would say like my ability to understand, recognize when I’ve
0:44:08 actually had a, a poor night’s sleep or a good night’s sleep has gotten much better over
0:44:09 time.
0:44:14 Like, I think as humans, we’re not very well calibrated to it, but as you actually start
0:44:20 to wear them, you become a better self-reporter, actually.
0:44:21 I sleep badly.
0:44:24 I assume it’s because I’m middle age.
0:44:29 Uh, I do most of the things you’re supposed to do, but give me one tip for sleeping well.
0:44:31 I get to sleep, but then I wake up in the middle of the night.
0:44:37 Yeah, that, uh, I think one of the things that a lot of people will do, um, is they’ll
0:44:41 worry, um, particularly in bed or use this as a time for thinking.
0:44:48 So a lot of the effective strategies surrounding that are to try to actually give
0:44:49 yourself that same time during the day.
0:44:54 That would be dedicated, unstructured time like the time you might otherwise experience
0:44:54 in bed.
0:44:58 If you tell me I should worry at 10 at night instead of three in the morning, if I worry,
0:45:02 if I say at 10 at night, okay, worry now that I’ll sleep through the night.
0:45:06 There’s literally evidence surrounding scheduling your worries out during the
0:45:08 day, and it does work.
0:45:10 So yeah, that’s it, schedule some worry.
0:45:14 I’m going to worry at 10 tonight and I’ll let you know tomorrow morning if it works.
0:45:15 Just don’t do it in bed.
0:45:15 Yeah.
0:45:17 Okay.
0:45:18 Okay.
0:45:25 Um, if you had to build a chat bot based on one of the following fictional therapists or
0:45:30 psychiatrists, which fictional therapist or psychiatrists would it be?
0:45:35 A, Jennifer Melfi from The Sopranos.
0:45:38 B, Dr. Krokowski from The Magic Mountain.
0:45:40 C, Frasier from Frasier.
0:45:41 Oh.
0:45:43 Or D, Hannibal Lecter.
0:45:44 Oh God.
0:45:44 Okay.
0:45:46 I would probably go with Frasier.
0:45:52 Uh, very different style of therapy than, but I think his demeanor is at least generally
0:45:52 decent.
0:45:57 Um, so yeah, and mostly appropriate with most of his clients from what I remember in the
0:45:57 show.
0:45:58 Okay.
0:46:01 It’s a very thoughtful response to an absurd question.
0:46:05 Um, anything else we should talk about?
0:46:07 You’ve asked wonderful questions.
0:46:13 Uh, one thing I will say maybe for, for folks that might be listening is a lot of folks are
0:46:17 already using generative AI for their mental health treatment.
0:46:17 Uh-huh.
0:46:24 And so I’ll give a recommendation, if folks are doing this already, that they just
0:46:28 treat it with the same level of concern they would have with the internet.
0:46:31 Um, they may, there may be benefits they can get out of it.
0:46:32 Awesome.
0:46:33 Great.
0:46:38 Um, but just don’t work on changing something within your daily life surrounding particularly
0:46:44 your behavior, um, based on what these models are doing without some real thought on making
0:46:48 sure that that is actually going to be a safe, safe thing for you to do.
0:46:59 Nick Jacobson is an assistant professor at the Center for Technology and Behavioral Health
0:47:02 at the Geisel School of Medicine at Dartmouth.
0:47:05 Today’s show was produced by Gabriel Hunter Chang.
0:47:10 It was edited by Lydia Jean Cott and engineered by Sarah Brugier.
0:47:14 You can email us at problem at pushkin.fm.
0:47:18 I’m Jacob Goldstein, and we’ll be back next week with another episode of What’s Your Problem?
0:47:32 Elon Musk, Doge, and Donald Trump are weaving a web of technological corruption.
0:47:37 Suddenly, the eyes of the industry are open to things that had been obvious to lots of other
0:47:38 people for months.
0:47:43 Isn’t it a conflict of interest that the president of the United States who regulates crypto has
0:47:43 his own coin?
0:47:50 I’m Lizzie O’Leary, the host of What Next TBD, Slate’s podcast about tech, power, and the
0:47:50 future.
0:47:57 What Next TBD covers the latest on how Silicon Valley is changing our government and our lives.
0:47:59 Listen wherever you get your podcasts.
Nick Jacobson and his team at Dartmouth Medical School spent over 100,000 hours trying to build an AI chatbot that can serve as a safe, effective therapist. After a few false starts, they seem to be on to something.
Note: This episode contains references to self harm.
Get early, ad-free access to episodes of What’s Your Problem? by subscribing to Pushkin+ on Apple Podcasts or Pushkin.fm. Pushkin+ subscribers can access ad-free episodes, full audiobooks, exclusive binges, and bonus content for all Pushkin shows.
Subscribe on Apple: apple.co/pushkin
Subscribe on Pushkin: pushkin.com/plus
See omnystudio.com/listener for privacy information.