From latent.space

o1 isn’t a chat model (and that’s the point)

How to use o1 in anger: Don’t Write Prompts; Write Briefs, Focus on Goals: describe WHAT you want, not HOW you want it, and Know what o1 does and does not do well!

#hackernews #ycombinator #人工智慧

on Jan 12

From latent.space

Everything you need to run Mission Critical Inference (ft. DeepSeek v3 + SGLang)

Baseten's Amir Haghighat and Yineng Zhang on DeepSeek V3, quantization, pricing strategies, SGLang, open source AI, and the three pillars of Mission Critical Inference

7h ago

From latent.space

Latent.Space 2024 Year in Review

For the 100th episode special, swyx and Alessio talk through the highlights of 2024 in Latent Space

on Jan 1

From latent.space

The 2025 AI Engineering Reading List

We picked 50 papers/models/blogs across 10 fields in AI Eng: LLMs, Benchmarks, Prompting, RAG, Agents, CodeGen, Vision, Voice, Diffusion, Finetuning. If you're starting from scratch, start here.

on Dec 27

From latent.space

What Ilya Saw

In 2014, 2016, 2024, and 2023.

on Dec 15

From latent.space

You're all wrong, $2000 ChatGPT Max is coming

And you will like it

on Dec 7

From latent.space

Bolt.new, Flow Engineering for Code Agents, and >$8m ARR in 2 months as a Claude Wrapper

The Stackblitz and Qodo CEOs dish on building production coding agents, from going viral as the hottest new consumer/low-code agent, to the gnarliest enterprise deployments for code/test agents.

on Dec 4

From latent.space

The new Claude 3.5 Sonnet, Computer Use, and Building SOTA Agents — with Erik Schluntz, Anthropic

Anthropic recently scored a huge win on OpenAI's turf by achieving SOTA on -their- SWE-Bench Verified benchmark, using an upgraded Claude 3.5 Sonnet. For the first time, they spill the beans.

on Nov 28

From latent.space

How to Run a Paper Club (also: LIVE at NeurIPS 2024!)

Your ultimate Paper Club Starter Kit, from your friends at the Latent Space Paper Club, where we have now read >100 papers. Also: Announcing Latent Space Paper Club LIVE! at NeurIPS 2024! Join us!

on Nov 25

From latent.space

OpenAI Realtime API: The Missing Manual

Everything we learned, and everything we think you need to know, from technical details on 24kHz/G.711 audio, RTMP, HLS, WebRTC, to Interruption/VAD, to Cost, Latency, Tool Calls, and Context Mgmt

on Nov 21

From latent.space

Why GPT Wrappers Are Good, Actually

Introducing the Smiling Curve of AI and why -both- model labs and their wrappers are becoming hilariously rich

on Nov 16

From latent.space

The Most Dangerous Thing An AI Startup Can Do Is Build For Other AI Startups

How Codeium went from 0 to >$10m in ten months, What enterpriseready.io got wrong. A comprehensive braindump on how to be Enterprise Infra Native!

on Nov 13

From latent.space

How NotebookLM Was Made

How to manage and engineer truly great AI products, why disagreement makes for great podcasts, iterating your way to a viral hit from a "Talk to Small Corpus" side project

on Oct 25

From latent.space

$2 H100s: How the GPU Bubble Burst

H100s used to be $8/hr if you could get them. Now there are 7 different resale markets selling them under $2. What happened?

on Oct 18

From latent.space

Language Agents: From Reasoning to Acting

Shunyu Yao on ReAct, Tree of Thought, CoALA, and the emerging importance of AI-Computer Interfaces, ft. returning guest + guest host Harrison Chase of LangChain/LangGraph!

on Sep 28

From latent.space

The Ultimate Guide to Prompting

Why DSPy is underrated, how to do few-shots properly, why role based prompting doesn't work, and how to HackAPrompt

on Sep 20

From latent.space

From API to AGI: Structured Outputs, OpenAI API platform and O1 Q&A — with Michelle Pokrass & OpenAI Devrel + Strawberry team

Our episode on all of OpenAI's new models and 2 new paradigms for inference.

on Sep 14

From latent.space

Emulating Humans with NSFW Chatbots - with Jesse Silver

Distilling personality into models with open LLMs, using DSPy and mitigating prompt injection, and 2-5xing the income of OnlyFans creators with the most advanced AI ecommerce chatbot we have ever seen

on Sep 14

From latent.space

Efficiency is Coming: 3000x Faster, Cheaper, Better AI Inference from Hardware Improvements, Quantization, and Synthetic Data Distillation

NVIDIA, Convai, and Google's Nyla Worker on the brutally efficient drivers of production AI inference - where we've been, and where LLMs are likely to go.

on Sep 4

From latent.space

Why you should write your own LLM benchmarks — with Nicholas Carlini, Google DeepMind

Stealing OpenAI models, why LLM benchmarks are useless for you, how to find value in using AI, and how they poisoned LAION with expired domains

on Aug 29

From latent.space

Is finetuning GPT4o worth it?

How Cosine Genie reached 50% on SWE-Bench Lite, 30% on the full SWE-Bench, and 44% on OpenAI's new SWE-Bench Verified, all state of the art results by the widest ever margin recorded.

on Aug 23

From latent.space

AI Magic: Shipping 1000s of successful products with no managers and a team of 12 — Jeremy Howard of Answer.ai

Why Answer is a PBC, going from QLoRA to QDoRA, updating BERT for 2025, creating FastHTML, predicting the OpenAI governance crisis, and a preview of the future of "Dialogue Engineering".

on Aug 23

From latent.space

Segment Anything 2: Demo-first Model Development

Don't bother keeping absolutely still: This vision model has memory now! Covering SAM 2 with Nikhila Ravi of Facebook AI Research, and special returning guest host Joseph Nelson of Roboflow

on Aug 8

From latent.space

How to Make AI UX Your Moat

Design great AI Products that go beyond "just LLM Wrappers": make AI more present, more practical, and then more powerful.

on Jul 24

From latent.space

Llama 2, 3 & 4: Synthetic Data, RLHF, Agents on the path to Open Source AGI

Llama 2 lead and Llama 3 post-training lead Thomas Scialom of Meta/FAIR, on the Chinchilla trap, why Synthetic Data and RLHF work, and how Llama4's focus on Agents will lead us to Open Source AGI.

on Jul 23

From latent.space

The Winds of AI Winter

Mar-Jun 2024 Recap: People are raising doubts about AI Summer. Here's why AI Engineers are the solution.

on Jul 23

From latent.space

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka

Becoming PaLM2 co-lead at Google Brain, training frontier LLMs entirely from ground up in the wilderness as a startup, and playing the AI research metagame.

on Jul 5

From latent.space

How To Hire AI Engineers — with James Brady & Adam Wiggins of Elicit

On Defensive vs Offensive AI Engineering and the ML First mindset: Presenting our ultimate guide to Hiring AI Engineers (and How to Source Them)!

on Jun 26

From latent.space

How to train a Million Context LLM — with Mark Huang of Gradient.ai

Scaling Llama3 beyond 1M context window with ~perfect utilization, the difference between ALiBi and RoPE, how to use GPT-4 to create synthetic data for your context extension finetunes, and more!

on May 30

From latent.space

[Cognitive Revolution] The Tiny Model Revolution with Ronen Eldan and Yuanzhi Li of Microsoft Research

The authors of TinyStories (and later phi-1) from Microsoft Research explain why Textbooks are All You Need, evaluating and interpreting models with GPT-4, and the future of smaller, better models.

on May 22

From latent.space

WebSim, WorldSim, and The Summer of Simulative AI — with Joscha Bach of Liquid AI, Karan Malhotra of Nous Research, Rob Haisfield of WebSim.ai

Three perspectives on the most viral fringe of generative AI this year: Simulative AI!

on Apr 27

From latent.space

Why Google failed to make GPT-3 + why Multimodal Agents are the path to AGI — with David Luan of Adept

Why Google failed to make GPT-3, how Adept is the "most misunderstood company" in AI, why multimodal knowledge work models like Fuyu are the future of AGI, and why Adept is NOT a research lab

on Mar 26

From latent.space

The Unbundling of ChatGPT (Feb 2024 Recap)

Peak ChatGPT? Also: our usual highest-signal recap of top items for the AI Engineer from Feb 2024!

on Mar 26

From latent.space

Latent Space | swyx | Substack

The AI Engineer newsletter + Top 10 US Tech podcast. Exploring AI UX, Agents, Devtools, Infra, Open Source Models. See https://latent.space/about for highlights from Chris Lattner, Andrej Karpathy, George Hotz, Simon Willison, Emad Mostaque, et al! Click to read Latent Space, a Substack...

on Dec 25, 2023

From latent.space

We Are Running Out of Low-Background Tokens (Nov 2023 Recap)

The AI contamination Red Alert, and Consistency Models! And our usual highest-signal recap of top items for the AI Engineer from Nov 2023. Now with 100% less OpenAI drama + 100% more Laundry Buddy!

on Dec 11, 2023

From latent.space

The End of OpenAI Hegemony

How yesterday's events reshape the AI Engineering landscape forever

on Nov 19, 2023

From latent.space

The State of Silicon and the GPU Poors - with Dylan Patel of SemiAnalysis

Listen now (53 mins) | "That Semianalysis Guy" on the incoming wave of GPU supply, the FLOPS demands of the next generation of AI, and being "Transformer-pilled"

on Nov 17, 2023

From latent.space

The New Kings of Open Source AI (Oct 2023 Recap)

Mistral is the new open source unicorn in town, top takes from the AI Engineer Summit, and our usual highest-signal recap of top items for the AI Engineer from Oct 2023

on Nov 13, 2023

From latent.space

AGI is Being Achieved Incrementally (OpenAI DevDay w/ Simon Willison, Alex Volkov, Jim Fan, Raza Habib, Shreya Rajpal, Rahul Ligma, et al)

Listen now (143 mins) | We summon all friends of the pod, and past and future guests including leaders from Nvidia, Zapier, HumanLoop, Weights and Biases, MultiOn, Guardrails, Bloop.ai, Julius AI to process what happened.

on Nov 8, 2023

From latent.space

The End of Finetuning — with Jeremy Howard of Fast.ai

Listen now | On learning AI fast and how AIs learn fast, the mission of doing more deep learning with less, inventing ULMFiT and why it's now wrong, and how to play the AI Discords game

on Oct 20, 2023

From latent.space

RAG is a hack - with Jerry Liu from LlamaIndex

Listen now (68 mins) | How to evaluate RAG, why it's still better than finetuning, and how LlamaIndex evolved from a tree-index builder to the most comprehensive framework to leverage data in LLM applications

on Oct 6, 2023