Code Your Own ChatGPT

Cyber Dioxide
4 min read · May 20, 2023


Step-by-Step Guide: Building a Powerful Chatbot in Python

In today’s digital era, chatbots have become ubiquitous, revolutionizing the way businesses interact with their customers. From customer support to virtual assistants, chatbots have proven to be efficient, scalable, and capable of enhancing user experiences. If you’re eager to dive into the world of conversational AI, Python provides a powerful and versatile platform for developing chatbots. In this blog, we’ll explore the fundamentals of coding a chatbot in Python.

Let’s begin…

Make an empty folder named “ChatGPT” and open it in your favorite code editor. Create a new file named code.py. In code.py we will build our dataset in a clean way.

# code.py
import pandas as pd

# Sample questions and responses
data = {
    'questions': [
        'What do you like to do in your free time?',
        'Do you have any hobbies or interests you enjoy?',
        'What kind of music do you like to listen to?',
        'Have you traveled anywhere interesting recently?',
        'How do you like to relax and unwind?'
    ],
    'responses': [
        'I enjoy playing video games and going for hikes in the mountains. How about you?',
        'Yeah, I like to read and watch movies. What about you?',
        'I like to listen to a variety of music, but my favorite genre is rock.',
        'I went on a trip to Japan last year and it was amazing. How about you?',
        'I like to meditate and do yoga to relax. What about you?'
    ],
    'intents': [
        'hobby',
        'hobby',
        'music',
        'travel',
        'relax'
    ]
}

# Convert data to a Pandas DataFrame
df = pd.DataFrame(data)

# Save the dataset to a CSV file
df.to_csv('chatbot_dataset.csv', index=False)

You can add more questions and assign them specific responses and intents. But why?

AI models are trained on datasets. This dataset contains questions, their responses, and their intents. When you ask your chatbot a question, it assigns the question a specific intent and then returns one of the responses stored for that intent in the dataset.
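To make the idea concrete, here is a minimal sketch of the lookup step, using three rows of the dataset above. The extra row added with pd.concat (the "favorite band" question and its reply) is a made-up example and not part of the original dataset, and responses_by_intent is just an illustrative name.

```python
import pandas as pd

# Three rows copied from the dataset in code.py
df = pd.DataFrame({
    'questions': [
        'What do you like to do in your free time?',
        'Do you have any hobbies or interests you enjoy?',
        'What kind of music do you like to listen to?',
    ],
    'responses': [
        'I enjoy playing video games and going for hikes in the mountains. How about you?',
        'Yeah, I like to read and watch movies. What about you?',
        'I like to listen to a variety of music, but my favorite genre is rock.',
    ],
    'intents': ['hobby', 'hobby', 'music'],
})

# Hypothetical extra row: adding more examples is just adding more rows
extra = pd.DataFrame({
    'questions': ['What is your favorite band?'],
    'responses': ['I switch favorites all the time. Who do you listen to?'],
    'intents': ['music'],
})
df = pd.concat([df, extra], ignore_index=True)

# Collect every stored response under its intent
responses_by_intent = df.groupby('intents')['responses'].apply(list).to_dict()

for intent, responses in responses_by_intent.items():
    print(intent, '->', len(responses), 'possible responses')
```

The more questions you add per intent, the more examples the classifier has to learn from and the more replies the chatbot can pick between.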

Now create trainer.py. This file will hold the code for training our model.

import pandas as pd
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from transformers import pipeline

# Load the dataset from the CSV file
df = pd.read_csv('chatbot_dataset.csv')

# Train a Naive Bayes classifier on the feature vectors and intents
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(df['questions'])
clf = MultinomialNB()
clf.fit(X, df['intents'])

# Use pre-trained model for generating responses
generator = pipeline("text-generation", model="gpt2")

# Define a function to get a response from the chatbot
def get_response(question):
    # Convert the question to a feature vector using the same vectorizer
    question_vec = vectorizer.transform([question])
    # Predict the intent of the question using the trained classifier
    intent = clf.predict(question_vec)[0]
    # Get a list of responses from the dataset for the predicted intent
    responses = df[df['intents'] == intent]['responses'].tolist()
    if len(responses) > 0:
        # Return a random response from the list
        return random.choice(responses)
    else:
        # Use the pre-trained model to generate a response
        response = generator(question, max_length=50, do_sample=True)[0]['generated_text']
        return response

# Define a list of fallback responses
fallback_responses = [
    "I'm sorry, I didn't understand your question.",
    "Could you please rephrase that?",
    "I'm not sure what you mean, can you please clarify?",
]

# Talk to the chatbot
print('Chatbot: Hi, how can I help you?')
while True:
    question = input('You: ')
    if question.lower() in ['bye', 'goodbye']:
        print('Chatbot: Goodbye!')
        break
    response = get_response(question)
    if response is not None:
        print(f'Chatbot: {response}')
    else:
        # Use a fallback response if the chatbot can't generate a response
        fallback_response = random.choice(fallback_responses)
        print(f'Chatbot: {fallback_response}')
This file trains the chatbot model and defines how responses are generated. A TfidfVectorizer converts the questions into numerical feature vectors, which are used for training. A Naive Bayes classifier (MultinomialNB) is trained on those feature vectors and the corresponding intents. The pipeline function from the transformers library loads a pre-trained model ("gpt2") for generating responses. The get_response function handles user queries. It takes a question as input and performs the following steps:
  • Converts the question into a feature vector using the same vectorizer used for training.
  • Predicts the intent of the question using the trained Naive Bayes classifier.
  • Retrieves a list of responses from the dataset based on the predicted intent.
  • If there are responses available for the intent, a random response is returned. Otherwise, the pre-trained model is used to generate a response.
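One thing worth knowing: clf.predict always returns one of the intents it was trained on, and every intent in this dataset has at least one response, so the list of responses is never empty here and the "otherwise" branch is hard to reach. The sketch below is one possible way to detect an unfamiliar question, not part of the original trainer.py. The helper name predict_intent and the min_confidence parameter are my own assumptions; the dataset is a three-row subset inlined so the example runs on its own.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Subset of the article's dataset, inlined to keep the sketch self-contained
df = pd.DataFrame({
    'questions': [
        'What do you like to do in your free time?',
        'What kind of music do you like to listen to?',
        'Have you traveled anywhere interesting recently?',
    ],
    'intents': ['hobby', 'music', 'travel'],
})

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(df['questions'])
clf = MultinomialNB()
clf.fit(X, df['intents'])

def predict_intent(question, min_confidence=0.0):
    """Return the predicted intent, or None if the question looks unfamiliar.

    min_confidence is a hypothetical knob: raise it to reject weak guesses.
    """
    question_vec = vectorizer.transform([question])
    # No word of the question is in the training vocabulary: nothing to go on
    if question_vec.nnz == 0:
        return None
    probabilities = clf.predict_proba(question_vec)[0]
    best = probabilities.argmax()
    # Reject predictions the classifier itself is not confident about
    if probabilities[best] < min_confidence:
        return None
    return clf.classes_[best]

print(predict_intent('What kind of music do you like to listen to?'))
print(predict_intent('zzz qqq'))  # None: no known words
```

When a helper like this returns None, the calling code could hand the question to the GPT-2 generator or pick one of the fallback responses instead of guessing an intent.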

The remaining code should make sense. Hit the run button and you will be prompted to type your question.
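If the chat loop is still unclear, here is a rough sketch of a single turn pulled out into a function so it can be tried without typing into a prompt. The name chat_turn and the respond argument are illustrative assumptions (respond stands in for get_response), and the added strip() call is my own small tweak to the original comparison.

```python
import random

# Same fallback list as in trainer.py
fallback_responses = [
    "I'm sorry, I didn't understand your question.",
    "Could you please rephrase that?",
    "I'm not sure what you mean, can you please clarify?",
]

def chat_turn(question, respond):
    """Handle one user message and return (reply, keep_going)."""
    # Typing 'bye' or 'goodbye' ends the conversation
    if question.strip().lower() in ('bye', 'goodbye'):
        return 'Goodbye!', False
    response = respond(question)
    # Use a fallback response if respond() could not produce anything
    if response is None:
        response = random.choice(fallback_responses)
    return response, True

# Try it with a stand-in for get_response that never has an answer
reply, keep_going = chat_turn('What is the weather like?', lambda q: None)
print(f'Chatbot: {reply}')
```

The while loop in trainer.py does the same thing over and over: read a question, stop on "bye", otherwise print whatever this step returns.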

If you liked my blog, make sure to follow and like.


Cyber Dioxide

I am a student, a part-time programmer, and a self-taught cyber security analyst. Visit my GitHub if you want Python exercises: https://github.com/Cyber-Dioxide