Anthropic is conducting research on model welfare, essentially checking whether models show signs of consciousness and, if they do, determining how to ‘humanely’ treat them.
I’m not afraid to say that I think this is ridiculous. At some point, perhaps I’ll change my mind.
At the core, the question is whether non-biological ‘thinking’ entities can have consciousness, and behind that question lies another: do humans actually have free will, or are our brains essentially electrical and chemical computers?
I believe in free will, although I don’t know how it works. And I don’t believe fully deterministic machines can have consciousness (although it doesn’t necessarily follow from belief in free will that a deterministic machine cannot be conscious).
From Anthropic’s announcement of the program:

“But as we build those AI systems, and as they begin to approximate or surpass many human qualities, another question arises. Should we also be concerned about the potential consciousness and experiences of the models themselves? Should we be concerned about model welfare, too?

“This is an open question, and one that’s both philosophically and scientifically difficult. But now that models can communicate, relate, plan, problem-solve, and pursue goals—along with very many more characteristics we associate with people—we think it’s time to address it.

“To that end, we recently started a research program to investigate, and prepare to navigate, model welfare.”