“It’s really important that we’re not ceding everything to AI and that we continue to add value ourselves in that collaboration.”
– Claire Mason
About Claire Mason
Claire Mason is Principal Research Scientist at Australia’s government research agency CSIRO, where she leads the Technology and Work team and the Skills project within the organization’s Collaborative Intelligence Future Science Platform. Her team investigates the workforce impacts of artificial intelligence and the skills workers will need to effectively use collaborative AI tools. Her research has been published in a range of prominent journals, including Nature Human Behaviour and PLOS ONE, and extensively covered in the popular media.
Google Scholar Page: Claire M. Mason
LinkedIn: Claire Mason
CSIRO Profile: Dr. Claire Mason
What you will learn
- Exploring collaborative intelligence with AI and humans
- Leveraging AI’s strengths and human expertise
- Enhancing medical diagnosis with the Sage patient management system
- Utilizing drones for faster rescue operations
- Essential skills for effective AI collaboration
- Productivity gains from generative AI in various industries
- Future research directions in AI and human teamwork
Episode Resources
Transcript
Ross Dawson: Claire, wonderful to have you on the show.
Claire Mason: Thank you, Ross. Lovely to be here.
Ross: So you are researching collaborative intelligence at CSIRO. Perhaps we could quickly say what CSIRO is, and also, what is collaborative intelligence?
Claire: Thank you. Well, CSIRO stands for the Commonwealth Scientific and Industrial Research Organisation. But more simply, it is Australia’s national science agency. We exist to support government objectives around social good and environmental protection, but also to support the growth of industry through science. And so we have researchers working in a wide range of fields, generally organized around challenges. And one of the key areas we’ve been looking at, of course, is artificial intelligence. It’s been called a general purpose technology, because its range of applications is so vast and it is so potentially transformative.
And collaborative intelligence is about a specific way of working with artificial intelligence. It’s about considering the AI almost as another member of a team or a partner in your work. Because up till now, most artificial intelligence applications have been about automating a specific task that was formerly performed by a human. But artificial intelligence has developed to the point where it is capable of seeing what we see, conversing with us in a natural way, and adapting to different types of tasks. And that makes it possible for it to collaborate with us: to understand the objective that we’re working on, to communicate about how the state of the objective, or even the human’s state, is changing over time, and thereby to produce an outcome that you can’t break down into the bit that the AI did and the bit that the human did. It’s truly a joint outcome. And we believe that has the potential to deliver a step change in performance.
Ross: Completely agree. Yeah, this is definitely high-potential stuff. So you’re doing plenty of research; some of it’s been published, some of it’s yet to be published. Perhaps you can give us a couple of examples of what you’re doing, either in research or in practice, which can, I suppose, crystallize these ideas?
Claire: Yeah, absolutely. So to begin with, the key element is that we’re trying to utilize the complementary strengths and weaknesses of human and artificial intelligence. We know artificial intelligence is vastly superior in terms of dealing with very large amounts of data, and being able to sustain attention on very repetitive tasks or ongoing monitoring. So it’s often very good when you’re dealing with a problem that requires very large amounts of data, or where you need to monitor something fairly continuously, because humans get bored, and they are subject to cognitive biases and social pressures. So that’s one area of strength that the AI has.
But the AI isn’t great at bringing contextual knowledge. It isn’t great at processing information from five different senses simultaneously yet, so it will fail at common-sense tasks that humans can perform easily. And it can’t deal with novel tasks: if it hasn’t seen that type of task before and hasn’t seen what the correct response is, it can’t respond to it. So it’s also important to have the human in the loop, if you like.
So, we actually developed a definition of what represented collaborative intelligence. Our criteria were that it had to be the human and the artificial intelligence communicating in a sustained way over time, on a shared objective, to produce a joint outcome; that there was this capability for the AI to understand changes in that objective; and that it also had to improve performance and the human’s work experience. So quite a lot of tough criteria. And we reviewed all of the academic literature to see whether this concept was actually delivered in reality; I think we did this study one or two years ago now. We only found 16 examples at the time, but they spanned a wide range of applications. Sometimes they were virtual, and a great example of that was something called the Sage patient management system. This is a system that’s meant to improve diagnosis and patient management.
The way the system works is really interesting. The AI doesn’t deliver a diagnosis. Its job is to take the same data that the physician has and to look for any contraindications in that big stream of data that suggest there might be an issue in the diagnosis or the ongoing management of the patient. So its job is only to intervene if it does see something that suggests maybe something’s wrong in that diagnosis or monitoring. And it then communicates with the human to say, have you looked at this? What about that? The idea here is that the AI isn’t in charge, but it is adding to the quality of the treatment and diagnosis by making sure the physician hasn’t missed anything.
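To make that "intervene only when something looks off" pattern concrete, here is a minimal Python sketch. It is not the Sage system itself; the rule table, observation names and thresholds are assumptions for illustration, and a real system would draw on far richer patient data.

```python
# Illustrative sketch only: not the actual Sage system described above, just the
# "intervene only when something looks off" pattern. The AI never issues a
# diagnosis; it watches the same observations the physician sees and raises a
# question when a reading conflicts with the working diagnosis.
from dataclasses import dataclass


@dataclass
class Observation:
    name: str       # e.g. "heart_rate"
    value: float


# Hypothetical ranges expected under a given working diagnosis.
EXPECTED_RANGES = {
    "stable_post_op": {"heart_rate": (40.0, 110.0), "temp_c": (35.5, 38.0)},
}


def review(diagnosis: str, observations: list[Observation]) -> list[str]:
    """Return questions for the physician, or an empty list if nothing stands out."""
    questions = []
    ranges = EXPECTED_RANGES.get(diagnosis, {})
    for obs in observations:
        low, high = ranges.get(obs.name, (float("-inf"), float("inf")))
        if not low <= obs.value <= high:
            questions.append(
                f"{obs.name} is {obs.value}, outside the range expected for "
                f"'{diagnosis}'. Have you looked at this?"
            )
    return questions


# The AI stays silent unless something suggests the diagnosis needs a second look.
print(review("stable_post_op", [Observation("heart_rate", 128.0)]))
print(review("stable_post_op", [Observation("temp_c", 36.8)]))  # -> []
```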
Another lovely example is a cyber-physical one. We’re interested in instances where the AI is embedded in a physical object, whether that be a robot or a drone. In this case, it’s called Drone Response, and it’s supporting human rescue teams. The way this works is that the human rescue team is given an alert that somebody is in trouble or needs rescuing. It’s hard for a human rescue team to get to a large area and search it quickly, but you can send a pack of drones out to that area. Those drones are streaming imagery back to the human response team and sending alerts when they think they may have spotted the person. But the drones still don’t have the capability to say, yes, that’s the one we’re looking for; that’s the job of the human response team. And when that decision is made, the drones can call on other drones that are equipped with something like a flotation device, if the person needing rescuing is in water, and bring that immediately. The human response team still needs to get there to rescue the person, because the drones can’t do that, but the team can get those drones to move around the person to understand how to get to the site quickly and what type of support they need to bring to help the person. So you can see a rescue operation happening more quickly and more likely to deliver a good outcome.
Ross: Those are great examples, and thinking about them, I wonder: why only 16? You can think of extrapolations or variations on what you’ve just described, or other use cases. There are many other ways, even quite simple ones, such as pre-identification of a particular pattern to alert humans; I mean, this is fairly common and easy. So what are the stumbling blocks to really building collaborative intelligence?
Claire: Finding a pattern and alerting a human isn’t that sustained interaction, that communication over time where you continue to build on one another’s work. So we’ve been looking at where this capability can be applied in our own science, and what we’ve discovered is two types of applications. And I would say that the arrival of generative AI has totally changed the context and is creating the potential for collaborative intelligence to be used in many domains. So our review is a little bit out of date now, because that’s really been such a game changer. But what we’ve found is that it can make a huge difference in areas of discovery.
So for us, scientific discovery would be things like our national collections: we hold millions of insects and flora and fauna, and we don’t have enough human beings with the expertise to be able to identify when, out of all the insects that have been sorted through, one might represent a new species. So it’s using AI to go through that entire collection and start digitizing it, and then working with the human to determine: this one looks anomalous, is it potentially something different that we need to study further? We’re also using this capability in the genome annotation space, where AI is already being used. But unfortunately, it’s being used in a way that’s proliferating errors, because if an incorrect classification was made originally, and everybody’s drawing upon one another’s work to build the body of knowledge, it gets magnified across many studies.
And so the potential here is that rather than just having the AI come up with suggested annotations, it’s working with the human over time, where the human can direct it to go to other studies or other datasets to refine the decisions that are made. So yes, there are many applications, things like discovering proteins that have special properties that are needed for particular applications. Cybersecurity is really big, because at the moment AI is being used, but it’s actually creating alert fatigue: the human cybersecurity professionals cannot respond to all the alerts that the AI generates. And so now we’re building AI to monitor the human workload, using cues such as their eye gaze or blood pressure, but also just how many alerts that human is currently dealing with, to modify the threshold for notifying the human and potentially do more of the work itself, in circumstances where that would make a difference.
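As a rough illustration of that workload-aware alerting idea, the sketch below adjusts the severity threshold for escalating an alert to the human based on how many alerts are already open and a hypothetical cognitive-load estimate from eye-gaze cues. The function names and weights are assumptions, not details of CSIRO’s system.

```python
# Illustrative sketch only: a hypothetical workload-aware alerting loop.
# When cues suggest the analyst is saturated, the escalation threshold
# rises and the AI triages more alerts itself.

def notification_threshold(open_alerts: int, gaze_load: float, base: float = 0.5) -> float:
    """Minimum alert severity (0-1) that gets escalated to the human.

    gaze_load is a hypothetical 0-1 cognitive-load estimate from eye-gaze cues.
    """
    backlog_penalty = min(open_alerts / 100.0, 0.3)  # bigger backlog -> escalate less
    load_penalty = 0.2 * gaze_load                   # higher load -> escalate less
    return min(base + backlog_penalty + load_penalty, 0.95)


def route_alert(severity: float, open_alerts: int, gaze_load: float) -> str:
    """Decide whether an alert goes to the human or is handled by the AI."""
    if severity >= notification_threshold(open_alerts, gaze_load):
        return "escalate_to_human"
    return "auto_triage"  # the AI handles it itself while the human is busy


# The same alert is escalated when the analyst is free, auto-triaged when overloaded.
print(route_alert(0.6, open_alerts=3, gaze_load=0.1))   # escalate_to_human
print(route_alert(0.6, open_alerts=30, gaze_load=0.9))  # auto_triage
```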
Ross: So when we think about humans plus AI in a system, part of it is obviously designing the AI and its outputs so that humans can more readily use them, make sense of them, or integrate them into their own mental models. And part of it, obviously, is the capability of the humans to use the output well, whatever is generated by the AI. So I understand you’ve been doing some study into the skills for effective use of generative AI. I’d love to hear more about that.
Claire: Thank you, Ross. I mean, that’s spot on. We’re also looking at things like how you design workflows, redesigning how things are done, because when you get a new technology and you just plug it into existing ways of doing things, you don’t really get the transformative benefits. A great example of that is in the world of chess, where it was a supercomputer, IBM’s Deep Blue, I think, that beat the human world champion in 1997. Supercomputers have improved massively since then, but they’ve been outclassed by hybrid teams, humans working with a computer. And what’s interesting in that story is that it’s not the best chess player or the fastest supercomputer being brought together that achieves the best results. It’s the humans that have the skills to collaborate with the AI, they don’t need to be grandmasters, and the AI that’s built to work with the humans, that achieve the best results.
And one of the ways in which we’ve been trying to understand what skills the human needs to collaborate in this longer-term way with artificial intelligence is by using generative AI as a test case. So in those instances where we have a simple question and we ask for an answer from the AI, we would not call that collaborative intelligence. But when you’re working with it over time to produce a report, and the human might ask for a structure for the report or some ideas for the report initially, and then suggest some sources that the AI should go to, then we consider it more of a form of collaborative intelligence.
And so we’ve been talking to expert users of generative AI across a range of fields, who were nominated by their peers. We’ve asked them what characteristics, whether skills, knowledge, or mindsets, make the difference between effective use of these tools and so-so use of these tools. And the feedback was, in some ways, what you would expect. One of the key things they talk about is understanding the strengths and weaknesses of the tool: what it’s good at, what types of tasks you might use code for, and when you might use GPT-4 instead, for example. But they also talked about the need for the human to still have domain expertise, to be able to understand what you’re looking for, how to evaluate the output of the AI, and how to improve upon it. And also, I guess, a responsible mindset where, as one person put it, I’m not considering the AI as a teammate so much as an intern, because it’s my job to guide it, and ultimately I’m responsible for the quality of the work.
So it’s really important that we’re not ceding everything to the AI, and that we continue to add value ourselves in that collaboration. And then they talked about having specific AI communication skills. They talked about some people using generative AI like Google, just plugging in a Google-search type of prompt, but you need to understand that it’s conversational, that you can speak in natural language, that you can improve upon your existing request if it hasn’t responded well to it, and that you can get different types of output with different prompts.
And then I think the last piece that came through really clearly was the importance of a learning orientation. The people who were using these tools well weren’t just adopting an existing way of doing things; they were exploring what else they could do with this capability and how they could do things better, and they were investing time in learning how to use it well. That’s in contrast to some people they described who tried it once, didn’t get the result they wanted, and therefore concluded it’s not that useful. You need to keep seeing what it can do, because that changes over time, and then think about, well, how do I use that now to deliver something better than I was delivering before?
Ross: So these were users’ self-reported skills and behaviors and mindsets?
Claire: Yes, so we just went in with a general question of, can you talk about how humans make a difference in getting really good output from a generative AI tool, versus ordinary output from a generative AI tool? What skills, knowledge, attitudes and mindsets do you think are important? And it really all clustered under those things: being informed about how to use the AI, being a responsible user, having the knowledge to direct it and evaluate its output, understanding how to communicate in a way that aligns with the affordances of the AI, and those are changing all the time, because now we have multimodal generative AI, and consequently being a great learner, constantly exploring what else is out there and how you can do it still better.
Ross: Well, a couple of thoughts, I suppose, on next steps here. One is the scope for empirical studies to look at the actual behaviors rather than self-reports. But the other thing is, if these are indeed what makes somebody an effective user of generative AI, how do we then propagate this, or educate people, or shift their attitudes? I mean, going back to the industry point, how does this flow through into assisting organizations to be more effective?
Claire: So there have been about four really good randomized controlled trials with generative AI, where half of the workers are randomly given access to the tool and half are not. Those studies have been at Boston Consulting Group with consultants, with programmers, with people doing writing tasks, and with call center workers. All of those studies found really significant productivity improvements: the call center workers could resolve 14% more issues, the developers were completing about 15% more pull requests, and the consultants were completing 12% more tasks and achieving what was evaluated as 40% higher quality. With the customer service one, I love this one, because not only did they get more productive, but worker turnover reduced, and the customers were expressing more positive sentiment in their communication with the agents. So it’s got those joint benefits that we’re really looking for: the workers are happier, and we’re getting productivity and quality benefits from the use of AI.
Ross: So the interesting point, though, is that in the Boston Consulting Group study they did separate between those who were given the tools without any training and those who were given some training. And whilst there was really just a very small overall incremental improvement from the people who had the skills training, in fact, in some cases there was a deficit. So that’s kind of an interesting point as to whether just giving people the tools is enough, or whether there is education in skills which can further improve the outputs.
Claire: Something that comes through consistently in this work is that it tends to be the less experienced and the lowest skilled workers who benefit most from the use of these tools. And the other thing that’s really important, and which gets to this question about whether people need special skills, is that even though performance improves on average, there are usually some tasks or decisions where the use of the AI is actually making the human’s decisions worse, because, as we’ve said, the AI is not very good at tasks it’s not trained to perform. So in those instances, maybe it provides bad advice, and in consequence the humans take longer or give a worse answer than they might have working alone. It’s because of those instances, and because training in understanding how these tools work and when they fail matters, that we are confident human skills are still going to be really important when we work with these tools. And we are actually developing and trialing some interventions within our own organization where we’re giving people information about the strengths and weaknesses of the tools, how they work, and how they’re trained.
And then we’re trialing that against another intervention, which is about promoting a mindful, metacognitive approach when you work with these tools. And the reason we think that’s really important is because, as you know, we all have cognitive heuristics, ways in which we are able to make decisions under conditions of uncertainty. And those rules were primarily developed for working with other humans. So, for example, if somebody speaks well on a certain topic, I infer that they will be good at another task which is related. That inference does not work when you’re dealing with generative AI, which can sound fantastic and perform brilliantly on some tasks, and then completely fall over on another task that to us would seem simple.
So what we’re arguing is that when we’re working with the AI in this collaborative way, nothing is relatively routine and automatic, and so our cognitive heuristics become less functional. Our role is actually to look for the things that don’t fit the rules, and to be more aware of where the AI might fall down and when it might be wrong. So we’re looking at training people to be more metacognitive: to think about, well, what other information might be missing, what other sources might I go to, to validate what I’m getting out of the AI? And so we’re interested in whether AI literacy, or metacognitive interventions, or the combination, is going to deliver better results for humans who are working with these tools.
Ross: Do you have a hypothesis?
Claire: Our hypothesis is that you need the two: that it’s great to know how AI works and where it can go wrong, but unless you’re switching on that awareness and mindful approach to metacognition, you’re not going to utilize that knowledge in how you respond to the AI. And I think that will be a real challenge for us. Generative AI is so good at so many things, how do we make sure we stay aware and alert to where things could be better, or where it’s made an error?
So it’s funny, because it kind of involves slowing down a bit with the generative AI to notice how to improve upon what it’s given you. And maybe we don’t need that on all tasks. But for knowledge work, I do think that’s where humans will add value: when they’ve retained that self-awareness, thinking about how they’re thinking and what the AI is doing, which the AI at this stage does not have. It’s intelligent, but it isn’t self-aware.
Ross: Are there any established approaches to developing metacognitive strategies? Or, for new spaces like this, are there new metacognitive strategies that we require?
Claire: Metacognition is very well understood in the educational domain. We know that metacognition helps people to learn more deeply and more quickly. That’s why in educational settings you’re often asked a lot of questions about something that you’re learning; they’re called metacognitive prompts. So, what are the strengths and weaknesses of adopting this approach? Or what alternative sources of information might you consider here? Those are what we call metacognitive prompts. And one possibility we might be looking at is building metacognitive prompts into artificial intelligence tools to encourage human engagement in metacognition.
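For a sense of what building metacognitive prompts into an AI tool could look like, here is a minimal sketch that appends one of the reflection questions Claire mentions to whatever a generative AI tool returns. The wrapper and prompt list are hypothetical, for illustration only.

```python
# Illustrative sketch only: a hypothetical wrapper that appends metacognitive
# prompts to a generative AI tool's answers, as floated in the episode.
import random

METACOGNITIVE_PROMPTS = [
    "What are the strengths and weaknesses of adopting this approach?",
    "What alternative sources of information might you consider here?",
    "What information might be missing from this answer?",
    "How would you check this against your own domain knowledge?",
]


def with_metacognitive_prompts(ai_answer: str, n_prompts: int = 1) -> str:
    """Wrap an AI-generated answer with reflection questions for the human reader."""
    prompts = random.sample(METACOGNITIVE_PROMPTS, k=n_prompts)
    reflection = "\n".join(f"- {p}" for p in prompts)
    return f"{ai_answer}\n\nBefore you use this, consider:\n{reflection}"


# Placeholder answer in place of a real model call.
print(with_metacognitive_prompts("Here is a draft structure for your report..."))
```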
Ross: That’s a very interesting outcome of that study. I think that’s particularly important. So it’s two-pronged, with literacy and metacognition. So the AI literacy piece goes back to what you were saying before around the basic approaches: what it’s effective for, and how we interact with generative AI?
Claire: Yes, but I think this concept of AI literacy has been around for a while, and with the proliferation of AI tools, we’re going to need much more sophisticated AI literacy, because we’re communicating and interacting with AI more and more over time. So building communication skills for generative AI, or communication with a cobot, which is going to be a very different proposition, is going to be necessary, and that hasn’t so much been the focus in the past. And there’s also nuance to this: generative AI literacy might be a very different kind of literacy from what you need for, say, a computer vision application. So yes, I think that’s going to become a whole big area of study in itself.
Ross: One of the very interesting areas of research you’re looking into is not just whether generative AI can provide enhancements to capabilities in a collaborative system, a human using generative AI, but also whether there is, or can be, an improvement in the capabilities of the human when you take the AI away. I know there have been some other research studies which have found, in the conditions they’ve set up anyway, that when they give people the tools they do better, and when they take the tools away the people don’t do any better than they did before, which I think is a lot about the framing of how you do that study. And I think it’s a very interesting idea: can we use generative AI in a way that makes us more capable without the use of the generative AI afterwards?
Claire: Absolutely. Is this just lifting us as long as we have access to generative AI, or can we learn while we’re using it, so that when it’s taken away we’re actually better at the task by ourselves? And other people are saying, could it make us less smart, because now we’re not using skills that we previously built up over time and experience? They’re really important questions, and I suspect there isn’t a simple answer, because it will depend on the way in which you’re using the AI. I mean, I’m pretty confident that when I use a language translator, I’m not learning, because I take the answer, I plug it in, and I get what I need out of it.
But that study of the customer service workers in a call center was really interesting, because they had a system outage. And what they found was that the low-skilled workers who’d used the generative AI continued to perform better. The theory was that the AI had been useful in helping to highlight the strategies that are best at dealing with a particular type of problem or type of customer, and they managed to retain that learning and continue to work better. So I think that’s going to be a really important area of study, especially for teachers, classrooms and universities, because we know these tools are starting to be integrated into those learning environments: how do we make sure not just that people are learning the skills to work with the generative AI, but that as they work with it more and more, they are also still learning that domain expertise that we know makes a difference?
Ross: Are you doing a study on that at the moment, or designing one?
Claire: We are designing a study. What we’re struggling with is the choice of tasks, and also the amount of engagement, because I guess what comes out of that language translation example is that it probably depends on how much we engage with the material that the generative AI gives us as to whether we learn from working with it or not. If I take the answer and I plug it in, I’m probably not processing it and learning from it. Maybe in other contexts, where I’m using its ideas to build upon them, we will see some of that learning happening. So yes, that is a study we’re doing. We’re going to have people doing about six different tasks: some people not getting generative AI at all, others getting it some of the time and not other times, and then we’ll assess everyone’s performance at the end, when they don’t have access to the generative AI, to see whether the people who got access to it have learned more than those who worked alone.
Ross: Well, I very much look forward to seeing the outputs of that. And I guess my frame is that at one end of the spectrum, you can design your human-AI collaboration so that the AI learns what the human does and then continues to take over more of that task. On the other hand, you can design systems where in every interaction the humans are learning, perhaps explicitly in some cases, around developing or extending capabilities or testing them. It is absolutely in the design of the human-AI collaboration whether or not the human learns, or unlearns, or becomes numb, or whatever it may be.
Claire: A fantastic example of that, Ross, which I’m not sure is excellent practice, but it takes up your idea: some schools are using a generative AI tool with students that always removes some of the information in its answer. I’m just not sure that’s a great way of teaching people to work with these tools, because it’s not really allowing you to experience the true affordances of the tool. But it speaks to that notion that maybe we’re going to have to build in ways of designing the AI to ensure we don’t lose our own intelligence and knowledge, and continue to grow from it.
Ross: Yeah, and I think this idea of over-reliance on AI is perhaps the biggest problem we might have. To round out: you’ve already discussed some of them, but what do you see as some of the other most exciting directions for the research and the potential of these approaches?
Claire: So I guess it comes back to those two types of use cases, discovery and monitoring, and identifying the areas where we need those strengths of the AI and the strengths of the human to get the best possible outcomes. And I think it’s about beginning to break down the types of work where you need that combination: where we can’t just automate it, and the human can’t deal with the volume of data or the speed of response that’s required, or the work is currently, as Erik Brynjolfsson would say, dirty, dull or dangerous. And then I think the real promise will be understanding how we design the work under that collaborative model, and things like how we calibrate trust appropriately: not too much, not too little. And the answer to that is going to depend on the type of task you’re doing and the type of AI you’re dealing with. So there is so much work to be done in this space, but really high potential, I think.
Ross: Fantastic! I love all the research in the work you’re doing now, Claire. I’m sure it’s going to have a very important impact. So thanks so much for your time and your insights.
Claire: Thank you for allowing us to share it. A pleasure.