On this page, you’ll find technical terms and concepts that you may come across while working with AI chatbots and generative AI. We hope that understanding these terms will help you better navigate the field!
Hallucinations
📕 Definition: Instances where the model generates information that is plausible-sounding but incorrect.
👉 Example: If you ask a chatbot for the capital of a fictional country, it might generate a convincing-sounding answer even though no such country exists.
🤔 Why does it matter?: Recognizing hallucinations is crucial for ensuring the reliability and trustworthiness of AI-generated content. It's important to validate the information provided by AI.
🔗 Source: OpenAI Blog on GPT-3 Hallucinations
Prompt Engineering
📕 Definition: The process of designing and refining the inputs (aka the prompts!) given to an AI model to receive the most accurate and relevant outputs.
👉 Example: Asking “Tell me about chess” is too vague and the intent is unclear. Asking “Can you explain the basic rules of chess, including how each piece moves and the objective of the game?” is specific and clearer in intent.
🤔 Why does it matter?: Good prompts improve accuracy and enhance the user experience.
🔗 Source: OpenAI Blog on P
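One way to picture prompt refinement is as filling in explicit parts of a request. The `build_prompt` helper below is purely illustrative, not part of any chatbot API; it just shows how adding audience and format details turns a vague ask into a specific one:

```python
def build_prompt(task, audience=None, output_format=None):
    """Assemble a clearer prompt from explicit parts (illustrative helper)."""
    parts = [task]
    if audience:
        parts.append(f"Explain it for {audience}.")
    if output_format:
        parts.append(f"Format the answer as {output_format}.")
    return " ".join(parts)

# Vague: "Tell me about chess"
# Refined:
prompt = build_prompt(
    "Can you explain the basic rules of chess, including how each piece "
    "moves and the objective of the game?",
    audience="a complete beginner",
    output_format="a numbered list",
)
print(prompt)
```

The same idea applies however you send the prompt: the more precisely the task, audience, and desired format are stated, the less the model has to guess.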
Temperature
📕 Definition: A hyperparameter in a model that controls the randomness of predictions.
👉 Example: When the temperature is closer to 1, the model generates more random output. For example, given the prompt “I like…,” a high temperature might produce answers that vary a lot, ranging from “I like cats” to “I like to skate on a frozen lake,” whereas a low temperature tends to produce similar answers such as “I like cats” and “I like dogs.”
🤔 Why does it matter?: You might want to control how random your model’s outputs are, depending on your goal. If you want super creative answers, you might increase the temperature closer to 1.
🔗 Source: Affino Blog on Temperature
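Under the hood, temperature typically divides the model’s raw scores (logits) before they are turned into probabilities. This is a minimal sketch assuming the common softmax-sampling setup; the exact mechanism can vary by model:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    # Dividing logits by a low temperature sharpens the distribution
    # (the top choice dominates); a high temperature flattens it,
    # giving unlikely choices a better chance of being sampled.
    scaled = np.array(logits, dtype=float) / temperature
    exp = np.exp(scaled - np.max(scaled))  # subtract max for numerical stability
    return exp / exp.sum()

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three next-word candidates
print(softmax_with_temperature(logits, 0.1))  # low temp: nearly deterministic
print(softmax_with_temperature(logits, 2.0))  # high temp: much more even spread
```

At temperature 0.1 nearly all probability mass lands on the highest-scoring candidate; at 2.0 the probabilities spread out, which is why high-temperature outputs feel more creative (and less predictable).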
Overfitting
📕 Definition: A failure that occurs during training, when a model predicts very accurately on the training data but is unable to generalize to the testing data.
👉 Example: An image classifier trained only on images of dogs with grass backgrounds may learn to correctly classify dogs with grass or greenery in the background, but fail to classify dogs indoors.
🤔 Why does it matter?: If we (unintentionally or intentionally) exclude certain groups from the model’s training data, it will fail to accurately predict for those underrepresented groups, making it an unreliable and biased model.
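A small numeric sketch of the train-vs-test gap, using NumPy polynomial fitting as a stand-in for model training (the degrees, noise level, and data here are illustrative assumptions): a high-capacity model memorizes the noisy training points but does worse on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
# Noisy samples of a sine wave act as our "training data".
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, size=10)
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test)  # clean held-out data

errors = {}
for degree in (3, 9):  # modest capacity vs. enough capacity to memorize
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    errors[degree] = (train_mse, test_mse)
    print(f"degree {degree}: train MSE {train_mse:.5f}, test MSE {test_mse:.5f}")
```

The degree-9 polynomial can pass through all 10 training points, so its training error collapses toward zero while its test error stays well above it: it has fit the noise, not the underlying pattern.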