From latent.space
o1 isn’t a chat model (and that’s the point)
10 17
How to use o1 in anger: Don’t Write Prompts; Write Briefs, Focus on Goals: describe WHAT you want, not HOW you want it, and Know what o1 does and does not do well!
on Jan 12
From latent.space
Everything you need to run Mission Critical Inference (ft. DeepSeek v3 + SGLang)
1 1
Baseten's Amir Haghighat and Yineng Zhang on DeepSeek V3, quantization, pricing strategies, SGLang, open source AI, and the three pillars of Mission Critical Inference
7h ago
From latent.space
Latent.Space 2024 Year in Review
0 0
For the 100th episode special, swyx and Alessio talk through the highlights of 2024 in Latent Space
on Jan 1
From latent.space
The 2025 AI Engineering Reading List
0 17
We picked 50 papers/models/blogs across 10 fields in AI Eng: LLMs, Benchmarks, Prompting, RAG, Agents, CodeGen, Vision, Voice, Diffusion, Finetuning. If you're starting from scratch, start here.
on Dec 27
From latent.space
Bolt.new, Flow Engineering for Code Agents, and >$8m ARR in 2 months as a Claude Wrapper
0 0
The Stackblitz and Qodo CEOs dish on building production coding agents, from going viral as the hottest new consumer/low-code agent, to the gnarliest enterprise deployments for code/test agents.
on Dec 4
From latent.space
The new Claude 3.5 Sonnet, Computer Use, and Building SOTA Agents — with Erik Schluntz, Anthropic
0 0
Anthropic recently scored a huge win on OpenAI's turf by achieving SOTA on -their- SWE-Bench Verified benchmark, using an upgraded Claude 3.5 Sonnet. For the first time, they spill the beans.
on Nov 28
From latent.space
How to Run a Paper Club (also: LIVE at NeurIPS 2024!)
0 0
Your ultimate Paper Club Starter Kit, from your friends at the Latent Space Paper Club, where we have now read >100 papers. Also: Announcing Latent Space Paper Club LIVE! at NeurIPS 2024! Join us!
on Nov 25
From latent.space
OpenAI Realtime API: The Missing Manual
0 0
Everything we learned, and everything we think you need to know, from technical details on 24kHz/G.711 audio, RTMP, HLS, WebRTC, to Interruption/VAD, to Cost, Latency, Tool Calls, and Context Mgmt
on Nov 21
From latent.space
Why GPT Wrappers Are Good, Actually
0 0
Introducing the Smiling Curve of AI and why -both- model labs and their wrappers are becoming hilariously rich
on Nov 16
From latent.space
The Most Dangerous Thing An AI Startup Can Do Is Build For Other AI Startups
0 0
How Codeium went from 0 to >$10m in ten months, What enterpriseready.io got wrong. A comprehensive braindump on how to be Enterprise Infra Native!
on Nov 13
From latent.space
0 0
How to manage and engineer truly great AI products, why disagreement makes for great podcasts, iterating your way to a viral hit from a "Talk to Small Corpus" side project
on Oct 25
From latent.space
$2 H100s: How the GPU Bubble Burst
0 0
H100s used to be $8/hr if you could get them. Now there are 7 different resale markets selling them for under $2. What happened?
on Oct 18
From latent.space
Language Agents: From Reasoning to Acting
0 0
Shunyu Yao on ReAct, Tree of Thought, CoALA, and the emerging importance of AI-Computer Interfaces, ft. returning guest + guest host Harrison Chase of LangChain/LangGraph!
on Sep 28
From latent.space
The Ultimate Guide to Prompting
0 0
Why DSPy is underrated, how to do few-shots properly, why role based prompting doesn't work, and how to HackAPrompt
on Sep 20
From latent.space
0 0
Our episode on all of OpenAI's new models and 2 new paradigms for inference.
on Sep 14
From latent.space
Emulating Humans with NSFW Chatbots - with Jesse Silver
0 0
Distilling personality into models with open LLMs, using DSPy and mitigating prompt injection, and 2-5xing the income of OnlyFans creators with the most advanced AI ecommerce chatbot we have ever seen
on Sep 14
From latent.space
0 1
NVIDIA, Convai, and Google's Nyla Worker on the brutally efficient drivers of production AI inference - where we've been, and where LLMs are likely to go.
on Sep 4
From latent.space
Why you should write your own LLM benchmarks — with Nicholas Carlini, Google DeepMind
0 0
Stealing OpenAI models, why LLM benchmarks are useless for you, how to find value in using AI, and how they poisoned LAION with expired domains
on Aug 29
From latent.space
0 0
How Cosine Genie reached 50% on SWE-Bench Lite, 30% on the full SWE-Bench, and 44% on OpenAI's new SWE-Bench Verified, all state of the art results by the widest ever margin recorded.
on Aug 23
From latent.space
0 0
Why Answer is a PBC, going from QLoRA to QDoRA, updating BERT for 2025, creating FastHTML, predicting the OpenAI governance crisis, and a preview of the future of "Dialogue Engineering".
on Aug 23
From latent.space
Segment Anything 2: Demo-first Model Development
0 4
Don't bother keeping absolutely still: This vision model has memory now! Covering SAM 2 with Nikhila Ravi of Facebook AI Research, and special returning guest host Joseph Nelson of Roboflow
on Aug 8
From latent.space
0 0
Design great AI Products that go beyond "just LLM Wrappers": make AI more present, more practical, and then more powerful.
on Jul 24
From latent.space
Llama 2, 3 & 4: Synthetic Data, RLHF, Agents on the path to Open Source AGI
0 0
Llama 2 lead and Llama 3 post-training lead Thomas Scialom of Meta/FAIR, on the Chinchilla trap, why Synthetic Data and RLHF work, and how Llama 4's focus on Agents will lead us to Open Source AGI.
on Jul 23
From latent.space
0 0
Mar-Jun 2024 Recap: People are raising doubts about AI Summer. Here's why AI Engineers are the solution.
on Jul 23
From latent.space
The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka
0 0
Becoming PaLM2 co-lead at Google Brain, training frontier LLMs entirely from ground up in the wilderness as a startup, and playing the AI research metagame.
on Jul 5
From latent.space
How To Hire AI Engineers — with James Brady & Adam Wiggins of Elicit
0 0
On Defensive vs Offensive AI Engineering and the ML First mindset: Presenting our ultimate guide to Hiring AI Engineers (and How to Source Them)!
on Jun 26
From latent.space
How to train a Million Context LLM — with Mark Huang of Gradient.ai
0 0
Scaling Llama3 beyond 1M context window with ~perfect utilization, the difference between ALiBi and RoPE, how to use GPT-4 to create synthetic data for your context extension finetunes, and more!
on May 30
From latent.space
0 0
The authors of TinyStories (and later phi-1) from Microsoft Research explain why Textbooks are All You Need, evaluating and interpreting models with GPT-4, and the future of smaller, better models.
on May 22
From latent.space
0 0
Three perspectives on the most viral fringe of generative AI this year: Simulative AI!
on Apr 27
From latent.space
0 0
Why Google failed to make GPT-3, how Adept is the "most misunderstood company" in AI, why multimodal knowledge work models like Fuyu are the future of AGI, and why Adept is NOT a research lab
on Mar 26
From latent.space
The Unbundling of ChatGPT (Feb 2024 Recap)
0 0
Peak ChatGPT? Also: our usual highest-signal recap of top items for the AI Engineer from Feb 2024!
on Mar 26
From latent.space
Latent Space | swyx | Substack
0 1
The AI Engineer newsletter + Top 10 US Tech podcast. Exploring AI UX, Agents, Devtools, Infra, Open Source Models. See https://latent.space/about for highlights from Chris Lattner, Andrej Karpathy, George Hotz, Simon Willison, Emad Mostaque, et al!
on Dec 25, 2023
From latent.space
We Are Running Out of Low-Background Tokens (Nov 2023 Recap)
0 0
The AI contamination Red Alert, and Consistency Models! And our usual highest-signal recap of top items for the AI Engineer from Nov 2023. Now with 100% less OpenAI drama + 100% more Laundry Buddy!
on Dec 11, 2023
From latent.space
0 0
How yesterday's events reshape the AI Engineering landscape forever
on Nov 19, 2023
From latent.space
The State of Silicon and the GPU Poors - with Dylan Patel of SemiAnalysis
0 0
"That Semianalysis Guy" on the incoming wave of GPU supply, the FLOPS demands of the next generation of AI, and being "Transformer-pilled"
on Nov 17, 2023
From latent.space
The New Kings of Open Source AI (Oct 2023 Recap)
0 0
Mistral is the new open source unicorn in town, top takes from the AI Engineer Summit, and our usual highest-signal recap of top items for the AI Engineer from Oct 2023
on Nov 13, 2023
From latent.space
0 0
We summon all friends of the pod, and past and future guests including leaders from Nvidia, Zapier, HumanLoop, Weights and Biases, MultiOn, Guardrails, Bloop.ai, Julius AI to process what happened.
on Nov 8, 2023
From latent.space
The End of Finetuning — with Jeremy Howard of Fast.ai
0 1
On learning AI fast and how AIs learn fast, the mission of doing more deep learning with less, inventing ULMFiT and why it's now wrong, and how to play the AI Discords game
on Oct 20, 2023
From latent.space
RAG is a hack - with Jerry Liu from LlamaIndex
0 0
How to evaluate RAG, why it's still better than finetuning, and how LlamaIndex evolved from a tree-index builder to the most comprehensive framework to leverage data in LLM applications
on Oct 6, 2023