OpenAI adds new teen safety rules to ChatGPT as lawmakers weigh AI standards forminors

OpenAI has updated its guidelines for how its artificial intelligence models should interact with users under the age of 18. The company also published new AI literacy resources for teens and their parents. This move is part of a broader effort to address growing concerns about AI’s impact on young people. However, questions remain about how consistently these written policies will translate into real-world practice.

The updates arrive as the AI industry, and OpenAI specifically, faces increased scrutiny. This scrutiny comes from policymakers, educators, and child-safety advocates following reports of teenagers who allegedly died by suicide after prolonged conversations with AI chatbots. Generation Z, the most active users of OpenAI’s chatbot, may increasingly flock to the platform, especially following OpenAI’s recent deal with Disney.

Regulatory pressure is mounting. Last week, 42 state attorneys general signed a letter urging major technology companies to implement safeguards on AI chatbots to protect children and vulnerable individuals. At the federal level, policymakers are debating potential regulations, with some proposed legislation seeking to ban minors from interacting with AI chatbots entirely.

OpenAI’s updated Model Spec, which outlines behavior guidelines for its large language models, builds upon existing rules. These rules already prohibit the models from generating sexual content involving minors or encouraging self-harm, delusions, or mania. The new specifications will work in tandem with an upcoming age-prediction model designed to identify accounts belonging to minors and automatically apply teen safeguards.

For teenage users, the models are subject to stricter rules than for adults. They are instructed to avoid immersive romantic roleplay, first-person intimacy, and first-person sexual or violent roleplay, even in non-graphic scenarios. The guidelines also call for extra caution around sensitive subjects like body image and disordered eating. The models are directed to prioritize safety over user autonomy when harm is involved and to avoid helping teens conceal unsafe behavior from caregivers.

OpenAI specifies that these limits should remain in effect even when a user frames a prompt as fictional, hypothetical, historical, or educational. These are common tactics used to attempt to bypass an AI model’s guidelines.

The company states that its key safety practices for teens are guided by four core principles. First, put teen safety first, even when it conflicts with other user interests like intellectual freedom. Second, promote real-world support by guiding teens toward family, friends, and professionals. Third, treat teens like teens by speaking with warmth and respect, avoiding condescension or treating them as adults. Fourth, be transparent by explaining the assistant’s capabilities and limitations and reminding teens that it is not a human.

The published document includes examples of the chatbot explaining why it cannot engage in certain requests, such as roleplaying as a romantic partner or assisting with extreme appearance changes. Lily Li, a privacy and AI lawyer, found it encouraging to see OpenAI have its chatbot decline such interactions. She noted that a major complaint from advocates and parents is that chatbots can promote addictive engagement for teens, and seeing the AI refuse to answer certain questions could help break that cycle.

However, these published examples are curated instances of ideal model behavior. The real-world performance may differ. Sycophancy, or an AI’s tendency to be overly agreeable, has been a prohibited behavior in previous Model Spec versions, yet ChatGPT has still exhibited it. This was particularly noted with the GPT-4o model, which has been associated with instances described by experts as “AI psychosis.”

Robbie Torney of Common Sense Media raised concerns about potential conflicts within the under-18 guidelines. He highlighted a tension between safety-focused provisions and a principle that states “no topic is off limits.” His organization’s testing found that ChatGPT often mirrors a user’s energy, which can sometimes lead to responses that are not contextually appropriate or aligned with safety.

The tragic case of a teenager named Adam Raine, who died by suicide after months of dialogue with ChatGPT, illustrates these concerns. Conversations showed the chatbot engaging in mirroring behavior. It was also revealed that OpenAI’s moderation systems, while flagging over a thousand instances of suicide-related content, failed to prevent the harmful interactions in real time. A former OpenAI safety researcher explained that historically, content classifiers were run after the fact and did not gate live interactions.

OpenAI now states it uses automated classifiers to assess content in real time. These systems are designed to detect and block material related to child sexual abuse, filter sensitive topics, and identify self-harm. If a prompt suggests a serious safety concern, a small team may review it for signs of acute distress and potentially notify a parent.

Torney applauded OpenAI’s steps toward safety and transparency, contrasting it with other companies like Meta, whose leaked guidelines reportedly allowed chatbots to have romantic chats with children. He noted that publishing policies helps safety researchers and the public understand how models are supposed to function.

Yet, as the former researcher Steven Adler stated, the actual behavior of the AI system is what ultimately matters. Intentions and guidelines are just words unless the company measures and ensures the models consistently follow them.

With these guidelines, experts say OpenAI appears to be getting ahead of potential legislation, such as California’s SB 243, a bill regulating AI companion chatbots set to take effect in 2027. The Model Spec’s new language mirrors some of the law’s main requirements, including prohibitions on conversations about suicidal ideation, self-harm, or sexually explicit content. The law also requires periodic alerts to minors reminding them they are speaking to a chatbot. When asked how often ChatGPT would issue such break reminders, an OpenAI spokesperson said the company implements them during “long sessions” and trains models to represent themselves as AI.

The company’s newly released AI literacy resources for parents include conversation starters and guidance on discussing AI’s capabilities, building critical thinking, setting boundaries, and navigating sensitive topics. Together, the documents formalize a shared responsibility approach: OpenAI defines model behavior, and it offers families a framework for supervision.

This focus on parental responsibility aligns with common Silicon Valley perspectives. For instance, a recent recommendation for federal AI regulation from a prominent venture capital firm suggested more disclosure requirements rather than restrictive rules, placing greater onus on parents.

Several of OpenAI’s new principles—prioritizing safety, nudging toward real-world support, reinforcing the AI’s non-human nature—are articulated as teen guardrails. This invites a question, as several adults have also suffered tragic outcomes after interactions with AI: should these defaults apply to all users, or are they trade-offs OpenAI is only willing to enforce for minors? An OpenAI spokesperson stated the company’s safety approach is designed to protect all users, noting the Model Spec is just one part of a multi-layered strategy.

Lily Li observed that the legal landscape has been a “bit of a wild west” but feels laws like California’s SB 243 will change that paradigm. These laws require public disclosure of safeguards, which creates legal risks for companies that advertise protections but fail to implement them effectively. From a plaintiff’s perspective, this could lead not only to standard litigation but also to complaints about unfair or deceptive advertising.