In previous articles, we explored the fundamentals of Retrieval-Augmented Generation (RAG) and demonstrated a practical implementation using FAISS, Hugging Face, and Ollama to chat with local data. In this article, we take RAG a step further: moving beyond static vector databases, we'll show how you can fetch information from diverse sources in real time. This approach is conceptually similar to how services like ChatGPT or Claude connect to your Google Drive, OneDrive, or Dropbox, retrieving live data to answer your questions.
For instance, Claude allows users to connect directly to their live data sources, enabling its RAG system to search for real-time information.
Similarly, ChatGPT provides integrations for apps like Google Drive and Microsoft OneDrive, allowing it to retrieve and reason over your personal or work documents.
In the previous article, we demonstrated a complete Retrieval-Augmented Generation (RAG) pipeline. First, we sourced a dataset from Hugging Face and used the all-MiniLM-L6-v2 model to generate vector embeddings for the text. Next, we indexed these embeddings in a FAISS vector store. When a user asks a question, this system retrieves the most relevant information from the index through semantic search, providing the necessary context to generate a precise answer. This architecture is a classic example of a RAG system powered by a vector database.
That "classic" approach, however, is just one of many ways to build a RAG system. While vector search on a static document set is common, the true power of RAG lies in its flexibility and adaptability. The retrieval strategy can be adapted to many different data types and needs:
- Vector Search: The method we used previously. Finds data based on conceptual similarity.
- Lexical Search: Traditional keyword-based search (e.g., BM25) that excels at matching specific terms.
- Hybrid Search: A powerful combination of both vector and lexical search for balanced results (a short sketch follows this list).
- Structured Retrieval: Querying structured sources like databases (using SQL) or knowledge graphs.
- API-based Retrieval: A dynamic approach where the system calls an external tool or API to fetch live information, rather than relying on a pre-built, static index.
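To make the first three strategies concrete before we move on, here's a small, self-contained sketch of hybrid retrieval. It's illustrative only, not part of this article's project, and assumes the rank_bm25 and sentence-transformers packages are installed:

import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = [
    "Box is a cloud content management platform.",
    "FAISS indexes vector embeddings for semantic search.",
    "BM25 ranks documents by keyword relevance.",
]

# Lexical side: BM25 over whitespace-tokenized documents.
bm25 = BM25Okapi([d.lower().split() for d in docs])

# Vector side: the same embedding model we used in the previous article.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(docs, normalize_embeddings=True)

def hybrid_search(query: str, alpha: float = 0.5) -> str:
    """Blend cosine similarity with normalized BM25 scores."""
    lexical = bm25.get_scores(query.lower().split())
    if lexical.max() > 0:
        lexical = lexical / lexical.max()
    semantic = doc_vecs @ model.encode(query, normalize_embeddings=True)
    scores = alpha * semantic + (1 - alpha) * lexical
    return docs[int(np.argmax(scores))]

print(hybrid_search("keyword ranking"))

Setting alpha closer to 1 favors semantic matches; closer to 0 favors exact keyword hits.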
In this article, we will focus on the powerful API-based retrieval method. Our goal is to build a system that uses a unified file-storage API to select files directly from Box.com, then feeds their content to an LLM to generate summaries in real time.
File Storage API
The file storage API provides a unified interface for major online storage platforms, including Box, Google Drive, OneDrive, Dropbox, and SharePoint.
Instead of juggling multiple APIs for different storage providers, the File Storage API provides a single endpoint for pushing and pulling data across all these systems. Think of it like the way Claude and ChatGPT handle app connections: it uses a secure vault for authorization, then provides seamless access to list and download files from your connected storage. Our objective is to fetch a list of available files, select the relevant ones, and download them for summarization. All Apideck APIs share the same base, https://unify.apideck.com, which branches into specific unified endpoints like /accounting, /file-storage, and /ats. For our purposes, we'll work with https://unify.apideck.com/file-storage.
To keep things simple, we're focusing exclusively on the Box.com connector and its file operations. All our operations will use the base URL: https://unify.apideck.com/file-storage/, with Apideck handling the underlying complexity.
Here's how Apideck’s file-storage API works
The File Storage API is a unified API gateway that connects to multiple applications providing document storage. If a developer has to create a service that connects to Google Drive, SharePoint, and Box, they have to build and maintain three separate connectors, each with different data fields and authentication logic. This is a major overhead. Apideck solves this by providing a simple, unified API structure. It handles maintaining the connectors and mapping the different data models, so for the developer, the API endpoint is always the same.
Take this example: the base URL is always https://unify.apideck.com. For the File Storage API, you simply add the path, like /file-storage/files/{id}. This consistency makes managing the APIs much easier.
All calls, whether to Box, Google Drive, or SharePoint, go through the same Apideck endpoint. Apideck uses a set of simple headers to route the request to the right place. That's it. Apideck handles the rest.
Here’s what a real API call to download a file looks like using curl:
curl --request GET \
  --url https://unify.apideck.com/file-storage/files/{file_id}/download \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'x-apideck-app-id: YOUR_APP_ID' \
  --header 'x-apideck-consumer-id: a_user_id_from_your_app' \
  --header 'x-apideck-service-id: box'
- Authorization: Your secret Apideck API key.
- x-apideck-app-id: The ID of your application in Apideck.
- x-apideck-consumer-id: The ID of the end-user in your system.
- x-apideck-service-id: This is where you specify which service to use, like Box, Google Drive, or SharePoint.
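These same headers drive every File Storage endpoint. As a quick illustration (separate from the project code we build below), here is the equivalent of listing files in Python with the requests library, using the GET /file-storage/files endpoint:

import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "x-apideck-app-id": "YOUR_APP_ID",
    "x-apideck-consumer-id": "a_user_id_from_your_app",
    "x-apideck-service-id": "box",
}

# List the files in the connected Box account.
resp = requests.get("https://unify.apideck.com/file-storage/files", headers=headers)
resp.raise_for_status()
for f in resp.json()["data"]:
    print(f["id"], f["name"])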
Now let’s talk about authentication: how does our application get permission to access a user's files in the first place?
Handling authentication for services like Box or Google Drive can be complex. Each has its own OAuth 2.0 flow, requiring you to manage redirects, handle callbacks, and securely store sensitive access and refresh tokens. Building this for multiple providers is not just a development challenge; it’s a significant security responsibility.
This is where Apideck Vault comes in. Vault acts as a secure, isolated container that manages the entire authentication process on your behalf. You can direct your users to a pre-built, secure UI called Hosted Vault where they can safely log in to their Box or Google Drive accounts. Vault handles the entire OAuth handshake and securely encrypts and stores the user's credentials, completely abstracting them away from your application. Your system then only needs to reference the user's consumer_id in your API calls, as shown in the example above. This approach drastically simplifies development and enhances security, as your application never has to handle or store the end-user's sensitive API tokens. You can read about the vault here.
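For completeness, here is a minimal, hedged sketch of how your backend might create a Hosted Vault session for a consumer. It assumes Apideck's POST /vault/sessions endpoint and a session_uri field in the response; check the Vault documentation for the exact request and response shape:

import requests

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "x-apideck-app-id": "YOUR_APP_ID",
    "x-apideck-consumer-id": "a_user_id_from_your_app",
}

# Create a Hosted Vault session for this consumer.
resp = requests.post("https://unify.apideck.com/vault/sessions", headers=headers, json={})
resp.raise_for_status()

# Redirect the user to this URL so they can connect their Box account.
print(resp.json()["data"]["session_uri"])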
Core Functionality Overview
Our system follows a simple workflow:
- The app fetches the list of files available on the active connector (Box, in our case) and displays it on the dashboard.
- The user selects a specific file from a drop-down list in the app.
- The app downloads the selected file.
- Once the file is downloaded, the AI generates a summary of its content, which is delivered to the user.
Technical Requirements
- A free Apideck account.
- Python 3.12+ installed on your system.
- A basic understanding of Python virtual environments and the requests library.
- Familiarity with LangChain, LLM basics, and how APIs work will help you follow the project.
Setting up the Environment
Before we dive into the code, we need to handle our credentials securely. Create a .env file in the root of your project directory. This file will store the API keys that our application needs to connect to Apideck.
APIDECK_API_KEY="sk_live_..."
APIDECK_APP_ID="YOUR_APP_ID..."
APIDECK_CONSUMER_ID="test-consumer"
You can get your API_KEY and APP_ID directly from your Apideck dashboard under the API Keys section. The CONSUMER_ID is a unique identifier for the end-user of your application; for this demo, we can just use a static value like test-consumer.
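With the .env file in place, the application can load these values at startup. Here's a minimal sketch using the python-dotenv package:

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file in the project root

API_KEY = os.getenv("APIDECK_API_KEY")
APP_ID = os.getenv("APIDECK_APP_ID")
CONSUMER_ID = os.getenv("APIDECK_CONSUMER_ID")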
The Coding Workflow
Our application logic is split into two main utility files: apideck_utils.py for handling the file storage connection, and llm_utils.py for processing the documents with our AI model. The full code can be found in this GitHub repository.
Connecting to Apideck (apideck_utils.py)
The code for this file can be found here.
import requests
from apideck_unify import Apideck


def fetch_file_list(api_key, app_id, consumer_id, service_id="box"):
    """Fetches a list of files from the specified service."""
    try:
        with Apideck(
            api_key=api_key, app_id=app_id, consumer_id=consumer_id
        ) as apideck:
            response = apideck.file_storage.files.list(service_id=service_id)
            if response.get_files_response and response.get_files_response.data:
                return response.get_files_response.data
            else:
                print("No files found or an issue occurred.")
                return []
    except Exception as e:
        print(f"An error occurred while fetching file list: {e}")
        return []


def download_file(file_id, file_name, api_key, app_id, consumer_id, service_id="box"):
    """Downloads a specific file and saves it locally."""
    try:
        download_url = (
            f"https://unify.apideck.com/file-storage/files/{file_id}/download"
        )
        headers = {
            "Authorization": f"Bearer {api_key}",
            "x-apideck-app-id": app_id,
            "x-apideck-consumer-id": consumer_id,
            "x-apideck-service-id": service_id,
        }
        # Follow redirects: Apideck hands the request off to Box's content servers.
        response = requests.get(download_url, headers=headers, allow_redirects=True)
        response.raise_for_status()
        with open(file_name, "wb") as f:
            f.write(response.content)
        print(f"Successfully downloaded '{file_name}'")
        return file_name
    except requests.exceptions.RequestException as e:
        print(f"An error occurred during download: {e}")
        return None
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return None
This file is our bridge to the Apideck File Storage API. It contains two core functions:
- fetch_file_list(): This function initializes the Apideck SDK with our credentials and calls the file_storage.files.list() endpoint. It simply asks Apideck to return a list of all files available in the connected Box account and gives us back a clean list of file objects, including their names and unique IDs.
- download_file(): Once we have a file's ID, this function takes over. As we discovered during development, the most reliable way to handle downloads is to make a direct HTTP request to the download endpoint. The function constructs the specific URL (https://unify.apideck.com/file-storage/files/{id}/download) and includes the necessary authentication headers (Authorization, x-apideck-app-id, etc.). It then uses the requests library to fetch the file, automatically handling any redirects from Apideck's servers to Box's content servers. The raw file content is then saved locally.
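Together, the two helpers can be exercised with a short, hypothetical command-line script (the attribute names f.id and f.name follow the apideck_unify SDK's file model; adjust if your SDK version differs):

import os
from dotenv import load_dotenv
from apideck_utils import fetch_file_list, download_file

load_dotenv()
creds = dict(
    api_key=os.getenv("APIDECK_API_KEY"),
    app_id=os.getenv("APIDECK_APP_ID"),
    consumer_id=os.getenv("APIDECK_CONSUMER_ID"),
)

files = fetch_file_list(**creds)
for f in files:
    print(f.id, f.name)

# Download the first file in the list, if any.
if files:
    download_file(files[0].id, files[0].name, **creds)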
Summarizing with LangChain and Ollama (llm_utils.py)
The code for this file can be found here.
from langchain_community.document_loaders import PyPDFLoader
from langchain_ollama import ChatOllama
from langchain.chains.summarize import load_summarize_chain
from langchain.prompts import PromptTemplate


def summarize_pdf(pdf_file_path: str, model_name: str = "llama3"):
    """
    Extracts text from a PDF and uses a local Ollama model to generate a detailed summary.
    """
    try:
        # temperature=0 keeps summaries deterministic across runs.
        llm = ChatOllama(model=model_name, temperature=0)
        prompt_template = """Write a concise summary of the following text.
Aim for a summary that is about 4-5 sentences long.
After the summary, provide 2-3 key takeaways as bullet points.
Text:
"{text}"
CONCISE SUMMARY AND KEY TAKEAWAYS:"""
        PROMPT = PromptTemplate(template=prompt_template, input_variables=["text"])
        # Load the downloaded PDF and split it into per-page documents.
        loader = PyPDFLoader(pdf_file_path)
        docs = loader.load()
        # map_reduce summarizes chunks first, then combines the partial summaries.
        chain = load_summarize_chain(
            llm, chain_type="map_reduce", map_prompt=PROMPT, combine_prompt=PROMPT
        )
        summary_result = chain.invoke(docs)
        return summary_result["output_text"]
    except FileNotFoundError:
        return f"Error: The file was not found at {pdf_file_path}"
    except Exception as e:
        return f"An error occurred during summarization: {e}. Please ensure Ollama is running and the PDF file is valid."
This file has one main function, summarize_pdf(), that orchestrates the entire summarization process using LangChain:
- Initialize the LLM: First, we connect to our local AI model using ChatOllama from langchain_ollama. We point it at whatever model we're running locally; the function defaults to llama3, and in our demo we used gemma3:1b-it-qat.
- Define a Prompt: We create a custom PromptTemplate. This is more than just asking for a summary; it's a specific set of instructions for the AI. We ask it to write a summary of a certain length and then provide key takeaways as bullet points, ensuring the output is structured and consistently useful.
- Load the Document: Using PyPDFLoader from langchain_community, the function loads the PDF file that we just downloaded and splits its content into processable documents.
- Run the Chain: Finally, we use LangChain's powerful load_summarize_chain. We configure it with the map_reduce chain type, which is excellent for documents of any size. It first runs our custom prompt on smaller chunks of the document (the "map" step) and then combines those partial summaries into a final, coherent output (the "reduce" step). The final text is then returned to the main application for display.
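As a quick sanity check, you can call the function directly on any local PDF; the file name below is just a placeholder:

from llm_utils import summarize_pdf

# "report.pdf" stands in for any PDF fetched via download_file().
print(summarize_pdf("report.pdf", model_name="llama3"))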
The Final Application
To bring this all together, we've built a simple but powerful user interface using Streamlit. This application orchestrates the entire workflow, serving as a practical demonstration of API-based RAG: it provides the user with AI-generated summaries of their files from Box. The complete code for the project can be found on GitHub here.
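The real app lives in the repository, but a stripped-down sketch of the Streamlit orchestration (hypothetical and simplified, not the repository code verbatim) looks roughly like this:

import os
import streamlit as st
from dotenv import load_dotenv
from apideck_utils import fetch_file_list, download_file
from llm_utils import summarize_pdf

load_dotenv()
creds = dict(
    api_key=os.getenv("APIDECK_API_KEY"),
    app_id=os.getenv("APIDECK_APP_ID"),
    consumer_id=os.getenv("APIDECK_CONSUMER_ID"),
)

st.title("Box File Summarizer")

# Step 1: list the files available in the connected Box account.
files = fetch_file_list(**creds)
names = {f.name: f.id for f in files}

# Step 2: let the user pick a file from a drop-down.
choice = st.selectbox("Choose a file to summarize", list(names))

# Steps 3 and 4: download the selected file, then summarize it.
if st.button("Summarize") and choice:
    with st.spinner("Downloading and summarizing..."):
        path = download_file(names[choice], choice, **creds)
        if path:
            st.write(summarize_pdf(path))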
Here's what the final application looks like in action:
Conclusion & What’s Next?
This article concludes our three-part series on building modern RAG systems. We began by establishing a foundational understanding of RAG with a static vector database using FAISS. In this final piece, we evolved that concept significantly by integrating a live data source through the Apideck File Storage API. By connecting directly to a service like Box, we’ve shown that RAG is not limited to static, pre-indexed documents but can be a dynamic, powerful tool for interacting with real-time information from a vast array of sources.
This project is just the starting point for API-based RAG. The true potential comes into play when an AI Agent can request data from multiple systems simply and easily. The groundwork we've laid with the File Storage API can be directly extended. Imagine building an agent that not only fetches files but also pulls customer data from a CRM or candidate information from an ATS. Since Apideck also provides unified APIs for those systems, you can create sophisticated agents that reason across your entire business toolkit.
Ready to get started?
Scale your integration strategy and deliver the integrations your customers need in record time.