AI tl;dr

Last updated: 2 months ago

GPT-3.5 vs GPT-4

Metric	GPT-3.5	GPT-4
Uniform Bar Exam	10th percentile	90th percentile
Biology Olympiad	31st percentile	99th percentile
Perplexity	18.3	9.8
Accuracy	83.9%	90.1%
Inference Time	0.12s	0.08s

Safer and More Aligned

OpenAI spent 6 months making GPT-4 safer and more aligned. Compared to GPT-3.5, GPT-4 is:

🔒 82% less likely to respond to requests for disallowed content
✅ 40% more likely to produce factual responses

GPT-4's advanced reasoning capabilities also expedited OpenAI's safety work: OpenAI used the model to create training data and iterate on classifiers.

Many-Shot Jailbreaking Effectiveness

Anthropic's research shows that as the number of "shots" (faux dialogues) in a prompt increases, large language models become more likely to produce harmful responses related to violent, hateful, deceptive, discriminatory, and regulated content. This "many-shot jailbreaking" technique takes advantage of growing context window sizes.

Responsible Disclosure

Anthropic has briefed other AI developers about the many-shot jailbreaking vulnerability and implemented mitigations on its own systems. It continues to research prompt-based mitigations to address the issue.

The company is publishing this research to help the broader AI community understand and address this new class of vulnerabilities enabled by growing context window sizes in large language models.

GPT Performance Trends

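The latest GPT models from OpenAI, GPT-4 and GPT-3.5, show significant performance improvements over previous versions across key metrics like perplexity, accuracy, and inference time. In the GPT-3.5 vs GPT-4 comparison, for example, perplexity falls from 18.3 to 9.8, accuracy rises from 83.9% to 90.1%, and inference time drops from 0.12s to 0.08s. 📈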
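
To make the size of those gaps easier to compare, here is a minimal Python sketch that computes the relative change for each metric. The figures are copied from the GPT-3.5 vs GPT-4 table; the dictionary keys and the `relative_change` helper are hypothetical names introduced only for this illustration.

```python
# Figures copied from the GPT-3.5 vs GPT-4 table: (GPT-3.5, GPT-4).
# The key names and the helper below are illustrative assumptions.
metrics = {
    "perplexity": (18.3, 9.8),
    "accuracy_pct": (83.9, 90.1),
    "inference_time_s": (0.12, 0.08),
}


def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


changes = {name: relative_change(old, new) for name, (old, new) in metrics.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
# perplexity: -46.4%
# accuracy_pct: +7.4%
# inference_time_s: -33.3%
```

Note that the accuracy figure is a relative change; in absolute terms the gap is 6.2 percentage points. For perplexity and inference time, lower is better, so the negative values are improvements.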

As AI systems continue advancing, we can expect further breakthroughs in areas like reasoning, knowledge, and safety alignment.

New Chief Product Officer

Mike Krieger, co-founder and former CTO of Instagram, has joined Anthropic as the Chief Product Officer. 👋

With deep experience building innovative products and scaling user experiences, Krieger is uniquely suited to take Anthropic's product efforts to new heights as the company continues its rapid growth.

Krieger cites Anthropic's focus on building capable and trustworthy AI systems as a key draw.

Anthropic Global Footprint

Anthropic is rapidly expanding its global presence and partnerships:

🇪🇺 Claude, Anthropic's AI assistant, is now available in Europe.
🤝 Anthropic partnered with AWS to build trusted AI solutions for enterprises.
🇰🇷 Anthropic partnered with SK Telecom on AI initiatives in South Korea.