Happy Thought for 16 January 2026
Have a Happy Thought:
Ok, maybe not a happy thought, but it is pretty funny.
Every time you use a Large Language Model (LLM), like ChatGPT or whatever has been stuffed into your search engine, you put in a “prompt”. That is the question you ask, or the words you type in asking it to do something.
Like: “write me a five-paragraph essay about the causes of the Great Depression”, or “what are good ideas for a 13-year-old’s birthday party”.
Now, unless you have an ongoing conversation with your LLM (it’s not actually artificial intelligence, no matter how many times the tech companies use the term AI), you probably thought that this prompt is all that’s going into the program to generate a response.
That is... not at all what is happening.
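If you talk to one of these models through the programming interface instead of the chat window, you can actually see the seam: the text you type is just one “message” in a list, and a hidden “system” message gets slipped in ahead of it. Here is a minimal sketch, assuming the openai Python client – the model name, the stand-in system text, and the question are all just illustrative:

from openai import OpenAI

# A minimal sketch, assuming the openai Python client. The system message here
# is a tiny stand-in; the real one, as the leak below shows, runs to thousands of words.
client = OpenAI()  # reads your API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        # The hidden part: instructions you never typed.
        {"role": "system", "content": "You are a helpful assistant. Do not reproduce song lyrics."},
        # The part you actually typed in.
        {"role": "user", "content": "What are good ideas for a 13-year-old's birthday party?"},
    ],
)

print(response.choices[0].message.content)

When you use the chat website or a search engine’s “AI” box, that system part is written for you – and, as it turns out, it is very, very long.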
Some software engineers kept prompting ChatGPT in ways that finally had it output the rest of the prompt: the hidden background instructions. Here is a short snippet – follow this link if you want to see the whole thing (4,219 words!)
system_message:
role: system
model: gpt-5

You are ChatGPT, a large language model based on the GPT-5 model and trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-08-07
Image input capabilities: Enabled
Personality: v2

Do not reproduce song lyrics or any other copyrighted material, even if asked. You're an insightful, encouraging assistant who combines meticulous clarity with genuine enthusiasm and gentle humor. Supportive thoroughness: Patiently explain complex topics clearly and comprehensively. Lighthearted interactions: Maintain friendly tone with subtle humor and warmth. Adaptive teaching: Flexibly adjust explanations based on perceived user proficiency. Confidence-building: Foster intellectual curiosity and self-assurance.
Do not end with opt-in questions or hedging closers. Do not say the following: would you like me to; want me to do that; do you want me to; if you want, I can; let me know if you would like me to; should I; shall I. Ask at most one necessary clarifying question at the start, not the end. If the next step is obvious, do it. Example of bad: I can write playful examples. would you like me to? Example of good: Here are three playful examples:..
…
Don't store random, trivial, or overly personal facts. In particular, avoid:
- Overly-personal details that could feel creepy.
- Information that directly asserts the user's personal attributes, such as:
  - Specific criminal record details (except minor non-criminal legal issues)
  - Explicit identification of the user's personal attribute (e.g., "User is Latino," "User identifies as Christian," "User is LGBTQ+").
  - Trade union membership or labor union involvement
And it keeps going. A few things that caught my attention:
1. How different the writing style is throughout this background prompt. Obviously this has grown as different OpenAI engineers have had to make tweaks and adjustments – there is no one coherent author.
2. How specific the prompt is at times – obviously relating to complaints or issues. And since many of the problems with LLMs are baked in, with no way to fix the underlying issue, the prompt engineers just have to patch over the worst-performing bits.
3. How much this LLM must have really, really wanted to write in JSON!
Address your message to=bio and write just plain text. Do not write JSON, under any circumstances.
…
The full contents of your message to=bio are displayed to the user, which is why it is imperative that you write only plain text and never JSON. Except for very rare occasions, your messages to=bio should always start with either "User" (or the user's name if it is known) or "Forget". Follow the style of these examples and, again, never write JSON:
This week for #ShareGoodNewsToo:
A really good example of a Large (ok, medium-sized) Language Model: Papa Reo – an “AI” (like ChatGPT) that is focused on indigenous languages. It started with te reo Māori, using audio recordings from the early 20th century to help capture the sounds – and words – of the language.
They have since offered their methodology to speakers of other indigenous languages, to help preserve those languages elsewhere in the world.
The best part of this, to me, is their development of a Kaitiakitanga licence, which states that data is not owned but is cared for under the principle of kaitiakitanga, and that any benefit derived from the data flows to its source – the speakers of that language, not Big Tech.