Compare extraction quality across Schematron-3B, Schematron-8B, and Gemini 2.5 Flash
→ Learn how to evaluate and use Schematron: https://exa.ai
This demo compares three models: Schematron-3B, Schematron-8B, and Gemini 2.5 Flash. It's important to understand that Schematron and Gemini 2.5 Flash are fundamentally different types of models.
Schematron models are purpose-built for structured data extraction. They do not take prompts—instead, they take HTML content and a JSON schema (which can be defined via Zod or Pydantic) and directly output structured data. This design makes them extremely efficient for extraction tasks.
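As a sketch of what those two inputs look like, here is a schema defined with Pydantic on the Python side (Zod plays the same role in TypeScript). The `Product` model and its fields are illustrative, not taken from Schematron's documentation:

```python
from pydantic import BaseModel

# Hypothetical schema for a product page; the fields are illustrative.
class Product(BaseModel):
    name: str
    price: float
    in_stock: bool

# Schematron's two inputs are HTML content and a JSON schema like this
# one -- there is no prompt to write.
json_schema = Product.model_json_schema()
```

The model receives the raw HTML alongside `json_schema` and returns JSON conforming to it directly.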
In this demo, we're using Gemini without traditional prompting—we're simply providing it with the JSON schema and HTML, then extracting via strict JSON response format. However, Gemini's accuracy can be significantly improved for specific tasks by adding carefully crafted prompts. There's much more flexibility with general-purpose models like Gemini.
Despite the flexibility advantage of Gemini, you'll see that Schematron handles extraction tasks extremely intelligently and is an order of magnitude faster and cheaper. This is what makes it particularly powerful for large-scale extraction workloads.
The latency shown in this demo represents total round-trip latency from our server (a Vercel Next.js route) to the model provider and back.
Unlike chat applications, we don't care as much about time-to-first-token—what matters is total latency from request to complete response.
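A minimal sketch of how that number is measured, with a stand-in for the real model call (the `extract` function here is hypothetical):

```python
import time

def extract(html: str) -> dict:
    # Stand-in for the real round trip to the model provider (hypothetical).
    return {"title": "Example Title"}

# Time the full request, from send until the *complete* response arrives --
# not time-to-first-token, which matters for chat but not for extraction.
start = time.perf_counter()
result = extract("<html><h1>Example Title</h1></html>")
total_seconds = time.perf_counter() - start
```

In the demo, the clock starts in the Vercel Next.js route and stops once the provider's full response has been received.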
We recommend Schematron specifically for large-scale extraction tasks. This is where it truly shines, as its speed and cost advantages unlock use cases that simply aren't economically viable with other models.
💡 Being significantly faster and cheaper means Schematron can handle extraction volumes that would be cost-prohibitive with general-purpose models.
Easiest to use: Our serverless API handles all the prompt templating for you. You only need to provide your HTML content and JSON schema; we take care of the rest.
More control: Both Schematron models are open source and available to try. However, using them directly requires managing your own prompt templating, which can be tricky to get right.
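For a sense of what self-managed templating involves, here is a minimal sketch. The template text below is an assumption for illustration, not the format Schematron was actually trained with; consult the model card for the real template:

```python
import json

# Hypothetical prompt template for running the open-weight models directly.
# The exact wording and structure the models expect is an assumption here.
def build_prompt(html: str, json_schema: dict) -> str:
    return (
        "Extract data from the HTML below so it conforms to this JSON schema.\n"
        f"Schema:\n{json.dumps(json_schema, indent=2)}\n"
        f"HTML:\n{html}\n"
    )

schema = {"type": "object", "properties": {"title": {"type": "string"}}}
prompt = build_prompt("<h1>Hello</h1>", schema)
```

Getting details like this consistent with the model's training format is exactly the part the serverless API handles for you.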
Ready to start using Schematron? Check out these resources to learn more and get started:
💡 According to the official benchmarks, Schematron is 40-80x cheaper than GPT-5 while maintaining frontier-level extraction quality.