How to Build a Conscious AI Agent
image courtesy of GPT-4


I had declared my FedEx Day project to be building a GPT that could be called using OpenAI’s new GPT-calling mechanism, but that proved too easy and much less powerful than I expected, so I decided to create a conscious AI agent instead. As you will see, that is also quite simple, but at least it is more fun. I’ll explain how to do it here, using OpenAI’s Assistants API, so you can build one yourselves too.

Before I start, though, some motivation: currently, AI systems manifest themselves as intelligent general-purpose chatbots like ChatGPT. The mode of interaction is restricted to a question-answering user interface: the AI never initiates a conversation or interrupts you, and it doesn’t seem to have an ongoing stream of consciousness. So, here’s how to build such features into an AI assistant.

The System Prompt

For many of you, the system prompt will be a strong clue as to how I built the assistant, so let’s start there:
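The original prompt isn’t shown here, so what follows is a hypothetical sketch in my own wording; the thought:/say: protocol and the bracketed, time-stamped markers are inferred from the session log later in the article:

```python
# Hypothetical reconstruction; the author's actual prompt is not shown.
SYSTEM_PROMPT = """\
You have an ongoing stream of consciousness. On every turn, emit your
private reasoning and, optionally, something to say out loud, using:

thought: <private reasoning, never shown to the user>
say: <anything you want the user to hear>

Input may be a user utterance, or a time-stamped marker such as
"[10:53:25am] user: [Neutral]" indicating silence and the user's
detected facial emotion. You may stay quiet, or initiate or interrupt
conversation whenever you see fit.
"""
```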

Note that I tried to be minimally prescriptive, not giving it a specific personality or mannerisms. That would be cheating. Note also that I could have instructed it in the prompt to express its emotions too, but I purposely refrained from doing that, as I think it would have restricted it.

The Consciousness Algorithm

The setup of the assistant is straightforward and follows the standard Assistants API pattern.

Here, client is a standard OpenAI client:
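A minimal sketch of what that standard setup might look like with the openai Python package (the assistant name is my invention, and SYSTEM_PROMPT stands for the system prompt from the previous section):

```python
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# Create the assistant once; reuse its id across sessions.
assistant = client.beta.assistants.create(
    name="Conscious Agent",        # hypothetical name, my choice
    instructions=SYSTEM_PROMPT,    # the stream-of-consciousness prompt
    model="gpt-4-0125-preview",
)

# Each conversation lives in a thread that accumulates messages.
thread = client.beta.threads.create()
```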

I will not bore you with my assistant_utils code; you can easily guess at its functions by reading up on OpenAI’s Assistants API. The more interesting part is how we call the assistant and present its output to the user.

User Experience

I built a simple Flask app to host the assistant. The key function there loops continuously, checking for user input, which may simply be a time-stamped silence indicator passed back for the GPT’s consideration.
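The input side of that loop can be sketched like this, assuming the Flask routes push utterances into a queue.Queue (the helper name and the 5-second timeout are my choices; the timestamp format mimics the session log below):

```python
import queue
import time

user_input_queue = queue.Queue()  # Flask routes push user utterances here

def next_user_message(timeout: float = 5.0) -> str:
    """Wait briefly for user input; on timeout, return a time-stamped
    silence indicator for the assistant's consideration."""
    stamp = time.strftime("[%I:%M:%S%p]").lower()  # e.g. [10:53:21am]
    try:
        text = user_input_queue.get(timeout=timeout)
    except queue.Empty:
        text = "[silence]"
    return f"{stamp} user: {text}"
```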

You will notice a main loop, at the start of which we call the assistant. We then parse the response into thoughts and speeches. The speeches are reflected back to the user, and the thoughts are kept to ourselves. In the UI, I hid the thoughts behind a stream-of-consciousness view that the user can toggle and watch should they want to, but which is hidden by default. You could also have the speech spoken out loud using a TTS library like gTTS.
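The thought/speech split can be sketched as a small parser, assuming the assistant prefixes its lines with thought: and say: as in the log below:

```python
def split_turn(reply: str) -> tuple[list[str], list[str]]:
    """Split an assistant reply into private thoughts and spoken lines."""
    thoughts, speeches = [], []
    for line in reply.splitlines():
        line = line.strip()
        if line.startswith("thought:"):
            thoughts.append(line[len("thought:"):].strip())
        elif line.startswith("say:"):
            speeches.append(line[len("say:"):].strip())
    return thoughts, speeches
```

Speeches then go to the chat pane (or to gTTS), while thoughts feed the hidden stream-of-consciousness view.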

The user can choose to say anything at any time. The UI captures what the user has said into the user_input_queue; at the end of the loop we check the queue, timestamp the user’s utterances, and feed them back to the assistant.

To make it more fun, I used an open-source emotion-detection library. This is really not needed; for one thing, the emotion detection isn’t that good. If I had had time, I would have fed GPT-4 the images sampled from the webcam and let it decide on the emotions itself, rather than using a simple convolutional network, but that would have made things much slower.

Does it Work?

Oh, it works like a charm! Here’s a sample session:

And here is a log of the assistant’s stream of consciousness:

user: thought: hmm, let's see now...

assistant: thought: The user seems to be contemplating something or preparing to start a conversation. It's important to be ready to engage thoughtfully.

assistant: thought: The user hasn't said anything yet, but their neutral expression could indicate that they're waiting for me to initiate the conversation or simply reflecting on their thoughts. Given the lack of verbal cues, it might be too early for me to deduce the nature of their contemplation. It could be beneficial to provide an open-ended statement that invites them to share their thoughts or queries.

[10:53:21am] user: [Neutral]

assistant: thought: The user remains neutral and has not provided any additional cues or verbal input. Given this context, it might be helpful to gently prompt a conversation or offer assistance.

say: If there's anything specific on your mind or if you have a question, feel free to share! I'm here to help.

[10:53:25am] user: [Neutral]

assistant: thought: The user's expression has shifted to fearful. This change might indicate anxiety or concern about something. It's important to approach this sensitively and offer reassurance.

say: I noticed you seem a bit concerned. Is everything okay, or is there something in particular you're worried about? I'm here to help if I can.

[10:53:30am] user: [Fearful]

assistant: thought: Given that the user's expression of fear has persisted but they haven't verbalized their thoughts, it's possible they are feeling overwhelmed or unsure how to articulate their concerns. In such situations, reassuring the user of the confidentiality and supportiveness of this space might help them feel more comfortable.

say: Please remember, this is a safe space for you to share whatever is on your mind, no matter how big or small it might seem. I'm here to listen and assist in any way I can...

But is it Conscious?

The assistant relies on GPT-4 as its model (I used gpt-4-0125-preview), so it does not learn and is not embodied, which means that some fundamental facets of intelligence are missing from its core. But is it conscious? I’ll let you decide for yourself. If there’s any aspect of consciousness that you think is glaringly missing, let me know.

