What is LlamaIndex?
LlamaIndex is a data framework designed for connecting custom data to large language models. Where LangChain is broad, LlamaIndex goes deep on one thing: getting your own data into LLMs effectively. It handles document loading, text splitting, indexing, embedding, and querying across formats like PDFs, databases, APIs, and more.
It's the go-to tool for building RAG (retrieval-augmented generation) systems. You load your documents, LlamaIndex creates vector indexes, and then you can query those indexes with natural language. The LLM calls for embedding and answer generation are where EUrouter comes in: set api_base to EUrouter, and that inference happens in Europe.
Quick to integrate
A few lines of code is all it takes. Swap your base URL and you are routed through EUrouter.
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_key="eur-...",
    api_base="https://api.eurouter.ai/v1",
)

response = llm.complete("Explain GDPR Article 44")
print(response)
Get started in minutes
Follow these steps to connect your application to EUrouter.
Install LlamaIndex
Install LlamaIndex with the OpenAI integration.
pip install llama-index llama-index-llms-openai
Configure the LLM
Set api_base to EUrouter when creating the LLM instance.
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_key="eur-...",
    api_base="https://api.eurouter.ai/v1",
)
Build your index
Create indexes and query engines as usual. All LLM calls route through EUrouter.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("Summarize the key points")
Why use LlamaIndex with EUrouter
EU-compliant data pipelines
LlamaIndex is built for connecting private data to LLMs. That often means sensitive documents: contracts, HR records, customer data. With EUrouter, the LLM inference for embedding and querying stays in Europe, so your entire RAG pipeline can be GDPR-compliant.
- All LLM calls for embedding and answer generation routed through EU infrastructure
- Embeddings API available through EUrouter for document indexing
- Pair with an EU-hosted vector store (pgvector, Qdrant, Weaviate) for end-to-end compliance
- Suitable for processing sensitive documents where data residency is a legal requirement
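To keep the embedding side of the pipeline in Europe as well, the embedding model can be pointed at EUrouter the same way as the LLM. A minimal sketch, assuming EUrouter exposes an OpenAI-compatible embeddings endpoint and that `text-embedding-3-small` is available through it:

```python
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Route both answer generation and embedding calls through EUrouter.
Settings.llm = OpenAI(
    model="gpt-4o",
    api_key="eur-...",
    api_base="https://api.eurouter.ai/v1",
)
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",  # assumed to be available via EUrouter
    api_key="eur-...",
    api_base="https://api.eurouter.ai/v1",
)
```

With `Settings` configured globally, `VectorStoreIndex.from_documents` and `as_query_engine` pick up these models automatically, so no per-call overrides are needed.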
Multi-model indexing
LlamaIndex pipelines typically involve two types of model calls: embeddings (high volume, lower cost) and generation (lower volume, higher cost). Through EUrouter, you can use a cost-effective embedding model for indexing thousands of documents, then switch to a more capable model for answer generation.
- Use affordable embedding models for indexing large document collections
- Switch to GPT-4o or Claude for final answer synthesis
- EUrouter's smart routing can optimize model selection for cost and latency
- One API key covers embedding and generation across all providers
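The split described above — an inexpensive embedding model for high-volume indexing, a stronger model for synthesis — can be expressed directly in LlamaIndex. A sketch, assuming both models are reachable through the same EUrouter key:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

EUROUTER = "https://api.eurouter.ai/v1"

# High-volume indexing with an inexpensive embedding model.
embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key="eur-...",
    api_base=EUROUTER,
)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)

# Low-volume answer synthesis with a more capable model.
llm = OpenAI(model="gpt-4o", api_key="eur-...", api_base=EUROUTER)
query_engine = index.as_query_engine(llm=llm)
```

Because indexing typically makes orders of magnitude more model calls than querying, this split is where most of the cost savings come from.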
Production-ready knowledge bases
When your LlamaIndex knowledge base goes to production, reliability matters. EUrouter adds automatic failover between model providers: if one provider is down or slow, your queries get rerouted to another. Your users don't notice a thing.
- Automatic failover between providers keeps your knowledge base online
- Per-key spend limits prevent runaway costs from high-volume indexing jobs
- Real-time monitoring shows latency and cost for every LLM call in your pipeline
- No single point of failure for your production RAG application
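EUrouter performs this failover server-side, so your LlamaIndex code stays unchanged. Conceptually, the behavior resembles this client-side sketch (the `query_with_failover` helper and the provider callables are illustrative, not part of any API):

```python
from typing import Callable, Sequence


def query_with_failover(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
) -> str:
    """Try each provider in order; return the first successful answer."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # provider down, slow, or over quota
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

If the first provider raises, the next one is tried transparently; only when every provider fails does the caller see an error.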