W&B Weave is an observability and evaluation platform for building reliable LLM applications. Weave helps you understand what your AI application is doing, measure how well it performs, and systematically improve it over time.

Why Weave?

Building LLM applications is fundamentally different from traditional software development. LLM outputs are non-deterministic, which makes debugging harder. Quality is subjective and context-dependent. Small prompt changes can cause unexpected shifts in behavior, and traditional testing approaches fall short. Weave addresses these challenges by providing:
  • Visibility into every LLM call, input, and output in your application
  • Systematic evaluation to measure performance against curated test cases
  • Version tracking for prompts, models, and data so you can understand what changed
  • Feedback collection to capture human judgments and production signals

What you can do with Weave

Debug with traces

Weave automatically traces your LLM calls and shows them in an interactive UI. You can see exactly what went into each call, what came out, how long it took, and how calls relate to each other. Get started with tracing
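
As a rough sketch of what this looks like with the Python SDK (the project and function names below are placeholders), you initialize Weave once and decorate the functions you want traced:

import weave

weave.init("quickstart")  # placeholder project name

# Functions decorated with @weave.op are traced: Weave records their inputs,
# outputs, latency, and how nested calls relate to each other.
@weave.op()
def format_prompt(question: str) -> str:
    return f"Answer concisely: {question}"

@weave.op()
def answer(question: str) -> str:
    prompt = format_prompt(question)  # shows up as a child call in the trace tree
    # A real application would call an LLM here; this stub keeps the sketch runnable.
    return f"(stub answer for: {prompt})"

print(answer("What is Weave?"))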

Evaluate systematically

Run your application against curated test datasets and measure performance with scoring functions. Track how changes to prompts or models affect quality over time. Build an evaluation pipeline
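
A minimal sketch of such a pipeline with the Python SDK, assuming a small inline dataset and a simple exact-match scorer (all names here are illustrative):

import asyncio
import weave

weave.init("quickstart")  # placeholder project name

# Each row of the dataset becomes one evaluation example.
examples = [
    {"question": "What is 2 + 2?", "expected": "4"},
    {"question": "What is the capital of France?", "expected": "Paris"},
]

# A scorer receives the model output plus matching fields from the dataset row.
@weave.op()
def exact_match(expected: str, output: str) -> dict:
    return {"correct": output.strip() == expected}

# The function under test; a real application would call an LLM here.
@weave.op()
def my_model(question: str) -> str:
    return "4" if "2 + 2" in question else "Paris"

evaluation = weave.Evaluation(dataset=examples, scorers=[exact_match])
print(asyncio.run(evaluation.evaluate(my_model)))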

Version everything

Weave tracks versions of your prompts, datasets, and model configurations. When something breaks, you can see exactly what changed. When something works, you can reproduce it. Learn about versioning
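
For example, with the Python SDK (the dataset name below is hypothetical), publishing an object under the same name again creates a new version, and references let you fetch a specific one back:

import weave

weave.init("quickstart")  # placeholder project name

# Publishing creates the first version; publishing changed rows under the
# same name later creates the next version.
dataset = weave.Dataset(
    name="qa-examples",
    rows=[{"question": "What is Weave?", "expected": "An observability and evaluation platform"}],
)
weave.publish(dataset)

# Fetch the published object back by name to reproduce a run.
fetched = weave.ref("qa-examples").get()
print(fetched.rows[0]["question"])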

Collect feedback

Capture human feedback, annotations, and corrections from production use. Use this data to build better test cases and improve your application. Collect feedback
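
A sketch of attaching feedback from code with the Python SDK, assuming the op.call() pattern that returns a handle to the logged call (in production, feedback often comes from your application's UI instead):

import weave

weave.init("quickstart")  # placeholder project name

@weave.op()
def answer(question: str) -> str:
    return "Weave is an observability and evaluation platform for LLM apps."

# .call() returns both the result and the logged call, so feedback can be
# attached to that specific trace.
result, call = answer.call("What is Weave?")
call.feedback.add_reaction("👍")                         # emoji reaction
call.feedback.add_note("Good answer; could cite docs.")  # free-text note
call.feedback.add("correctness", {"value": 5})           # custom structured feedback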

Monitor production

Score production traffic with the same scorers you use in evaluation. Set up guardrails to catch issues before they reach users. Set up guardrails and monitors
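
As a rough sketch with the Python SDK, a scorer applied to a live call can act as a guardrail; the attribute holding the scorer's output on the returned object is an assumption here:

import asyncio
import weave

weave.init("quickstart")  # placeholder project name

# The same scorer could also be used inside an Evaluation.
@weave.op()
def is_short_enough(output: str) -> bool:
    return len(output) <= 200

@weave.op()
def generate_reply(question: str) -> str:
    return "Weave traces, evaluates, and monitors LLM applications."

async def handle_request(question: str) -> str:
    reply, call = generate_reply.call(question)
    # Applying the scorer records its result against this call, so the same
    # check also serves as a production monitor in the Weave UI.
    score = await call.apply_scorer(is_short_enough)
    # Guardrail: fall back if the check fails (score.result is assumed to
    # hold the scorer's boolean output).
    return reply if score.result else "Sorry, I can't return that reply."

print(asyncio.run(handle_request("What is Weave?")))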

How Weave fits in your workflow

Weave supports the full LLM application development lifecycle:
Phase     What Weave provides
Build     Trace calls to understand behavior, debug issues, and iterate quickly
Test      Evaluate against datasets with custom and built-in scorers
Deploy    Version prompts and models for reproducible deployments
Monitor   Score production traffic, collect feedback, catch regressions

Supported languages

Weave provides SDKs for Python and TypeScript. Install the Python SDK with pip:
pip install weave
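For TypeScript, install from npm (assuming the SDK is published under the same package name):
npm install weave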
Both SDKs support tracing, evaluations, datasets, and other core Weave features. Some advanced features, such as class-based Models and Scorers, are currently Python-only.

Integrations

Weave integrates with popular LLM providers and frameworks:
  • LLM providers: OpenAI, Anthropic, Google, Mistral, Cohere, and more
  • Frameworks: LangChain, LlamaIndex, DSPy, CrewAI, and more
  • Local models: Ollama, vLLM, and other local inference servers
When you use a supported integration, Weave automatically traces LLM calls without additional code changes. View all integrations
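
For example, with the OpenAI Python client (the model name below is just an example), initializing Weave is enough for the call to be traced:

import weave
from openai import OpenAI

weave.init("quickstart")  # placeholder project name

# Calls made through supported clients are traced automatically once Weave
# is initialized; no decorators are needed.
client = OpenAI()  # expects OPENAI_API_KEY in the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)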

Next steps