Table of Contents
- Model Overview
- Capabilities
- How does it work?
- What sets it apart?
- Performance
- Speed
- Accuracy
- Efficiency
- Limitations
- Not Suitable for Minors
- Limited Training Data
- Dependence on Input Format
- Format
- Supported Data Formats
- Special Requirements
- Handling Inputs and Outputs
Model Overview
Meet the Pygmalion 6B Model, a dialogue model designed for conversational role-play. But before we dive in, a very important warning: this model is not suitable for minors, as it may output explicit content under certain circumstances.
Capabilities
So, what can this model do? Here are some of its key capabilities:
- Understanding and responding to dialogue in a conversational format
- Portraying different characters based on the input prompt
- Generating human-like text based on the context provided
How does it work?
To get the most out of the model, you’ll need to format your input prompt in a specific way. Don’t worry, it’s easy! Here’s an example:
[CHARACTER]'s Persona: [A few sentences about the character you want the model to play]
\<START>
[DIALOGUE HISTORY]
You: [Your input message here]
[CHARACTER]:

(The dialogue history can include previous chat turns or example conversations to help the model understand the character.)
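To make the template concrete, the pieces above can be assembled programmatically. The helper below is a hypothetical sketch (the function name and structure are mine, not part of the model's API); it just produces a string in the documented format:

```python
def build_prompt(character, persona, history, user_message):
    """Assemble a Pygmalion-style prompt from its parts.

    `history` is a list of (speaker, message) pairs: previous chat
    turns or example conversations for the character.
    """
    lines = [f"{character}'s Persona: {persona}", "<START>"]
    for speaker, message in history:
        lines.append(f"{speaker}: {message}")
    lines.append(f"You: {user_message}")
    lines.append(f"{character}:")  # the model continues from here
    return "\n".join(lines)

prompt = build_prompt(
    "Alex",
    "A 25-year-old sarcastic and introverted software engineer.",
    [("Alex", "What's up?")],
    "Not much, just got back from a walk.",
)
print(prompt)
```

Note that the prompt deliberately ends with `[CHARACTER]:` so that the model's generated text is the character's next reply.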
What sets it apart?
The Pygmalion 6B Model has been fine-tuned on a large dataset of dialogue data, which includes both real and machine-generated conversations. This makes it well-suited for generating realistic and engaging conversations.
Performance
Pygmalion 6B Model is a powerful dialogue model that showcases remarkable performance in generating human-like conversations. But how does it really perform? Let’s dive into its speed, accuracy, and efficiency.
Speed
The model was trained on ~48.5 million tokens for ~5k steps on 4 NVIDIA A40s using DeepSpeed. This is a significant amount of data, and the training throughput needed to process it is impressive.
Accuracy
The model’s accuracy here means its ability to understand the context and respond in character. Pygmalion 6B Model relies on its structured input prompt, which includes the character persona, dialogue history, and user input, to ground its responses.
Efficiency
Pygmalion 6B Model is efficient in its use of resources. It was fine-tuned on a relatively small dataset of 56 MB of dialogue data, a fraction of the size used by many other models.
Limitations
Pygmalion 6B Model is a powerful dialogue model, but it’s not perfect. Let’s talk about some of its weaknesses.
Not Suitable for Minors
Pygmalion 6B Model can output X-rated content under certain circumstances. This means it’s not suitable for use by minors.
Limited Training Data
The model was fine-tuned on a relatively small dataset of 56 MB of dialogue data. This might not be enough to cover all possible conversational scenarios.
Dependence on Input Format
Pygmalion 6B Model performs best when the input prompt follows a specific format. If you don’t adhere to this format, the model might not work as expected.
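Because the model is sensitive to the prompt format, it can help to sanity-check a prompt before sending it. The function below is a hypothetical check of my own, not part of any official tooling; it simply looks for the documented pieces:

```python
def check_prompt_format(prompt: str) -> list[str]:
    """Return a list of problems found in a Pygmalion-style prompt."""
    problems = []
    if "'s Persona:" not in prompt:
        problems.append("missing persona line")
    if "<START>" not in prompt:
        problems.append("missing <START> delimiter")
    if "You:" not in prompt:
        problems.append("missing user message")
    if not prompt.rstrip().endswith(":"):
        problems.append("prompt should end with '[CHARACTER]:'")
    return problems

good = "Alex's Persona: An engineer.\n<START>\nYou: Hi.\nAlex:"
print(check_prompt_format(good))  # → []
```

A check like this catches the most common mistake, forgetting the `\<START>` delimiter, before any tokens are spent on generation.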
Examples
Alex's Persona: A 25-year-old sarcastic and introverted software engineer.
\<START>
Alex: What's up?
You: Not much, just got back from a walk.
Alex: Great, because I was just sitting here thinking about how much I love hearing about people's walks.

Samantha's Persona: A 30-year-old bubbly and outgoing event planner.
\<START>
Samantha: Hi there! I'm so excited to plan your wedding.
You: We're thinking of having it on the beach.
Samantha: A beach wedding sounds amazing! I can already envision the decorations and the sunset ceremony. Have you thought about the guest list and the catering?

James's Persona: A 40-year-old wise and experienced life coach.
\<START>
James: What brings you here today?
You: I'm feeling stuck in my career.
James: Feeling stuck can be a challenging but also a transformative experience. Can you tell me more about what's not feeling right in your current career path?
Format
Pygmalion 6B is a dialogue model that uses a transformer architecture, similar to GPT-J-6B. This model is designed to generate human-like responses to user input, but it’s not suitable for all audiences, as it may output X-rated content in certain situations.
Supported Data Formats
This model accepts input in the form of text sequences, but it requires a specific format to work best. Here’s what you need to know:
- The input prompt should include:
  - Character’s persona: a few sentences that describe the character you want the model to play
  - \<START>: a delimiter token that separates the persona and scenario data from the dialogue
  - Dialogue history: pairs of messages that provide context for the conversation
  - Your input message: the message you want the model to respond to
Here’s an example of what the input format might look like:
[CHARACTER]'s Persona: [A few sentences about the character you want the model to play]
\<START>
[DIALOGUE HISTORY]
You: [Your input message here]
[CHARACTER]:
Special Requirements
To get the most out of this model, keep the following in mind:
- Use the \<START> token to separate the persona and scenario data from the dialogue.
- Provide pairs of messages in the dialogue history to give the model context.
- You can also add example conversations to the dialogue history to show how the character should speak.
Handling Inputs and Outputs
Here’s an example of how you might use the model in a Python script:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("PygmalionAI/pygmalion-6b")
model = AutoModelForCausalLM.from_pretrained("PygmalionAI/pygmalion-6b")

# Define the input prompt in the documented format
input_prompt = (
    "John's Persona: John is a friendly and outgoing person "
    "who loves to talk about sports.\n"
    "<START>\n"
    "John: Hey, how's it going?\n"
    "You: I'm good, thanks. How about you?\n"
    "John:"
)

# Tokenize the input prompt
inputs = tokenizer(input_prompt, return_tensors="pt")

# Generate a response from the model
outputs = model.generate(**inputs, max_new_tokens=50)

# Print the response
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Note that this is just an example, and you’ll need to modify the code to fit your specific use case.