# S3 Storage

> Step-by-step guide to using S3 for data ingestion and storage

A minimal guide to using S3 (or an S3-compatible service such as MinIO) to ingest data, store Cognee's internal files, or both.

**Before you start:**

* Complete [Quickstart](getting-started/quickstart) to understand basic operations
* Ensure you have [LLM Providers](setup-configuration/llm-providers) configured
* Have S3 credentials and access to an S3 bucket

## What S3 Storage Does

* **Ingest from S3**: Pass `s3://...` paths to `cognee.remember()` (or the legacy `cognee.add()`) to load data directly from S3
* **Store Cognee data on S3**: Set your data/system roots to S3 URLs to keep all files on S3
* **S3-compatible**: Works with MinIO and other S3-compatible services

## Prerequisites

If S3 support is not already installed, install Cognee with its AWS extra (boto3/s3fs), then add your credentials to `.env`:

```dotenv theme={null}
aws_access_key_id=your_access_key
aws_secret_access_key=your_secret_key
aws_region=us-east-1
# Optional for S3-compatible endpoints (e.g., MinIO):
aws_endpoint_url=http://localhost:9000
```
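
Before running an ingestion job, it can help to confirm that the credentials are actually visible to the process. The following is a minimal sketch, not part of Cognee: `missing_s3_settings` is a hypothetical helper, and it assumes the settings are available in the process environment under the names shown above (checked in both lowercase and uppercase).

```python
import os
from typing import Mapping

# Names copied from the .env example above; aws_endpoint_url is optional.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key", "aws_region")


def missing_s3_settings(env: Mapping[str, str]) -> list[str]:
    """Return the required keys that are absent or empty in `env`.

    Hypothetical helper, not a Cognee API. As an assumption, each key is
    looked up in the lowercase form shown above and in its uppercase form.
    """
    missing = []
    for key in REQUIRED_KEYS:
        value = env.get(key) or env.get(key.upper())
        if not value:
            missing.append(key)
    return missing


if __name__ == "__main__":
    # os.environ only contains values that were exported or already loaded
    # from .env by the time this runs.
    gaps = missing_s3_settings(os.environ)
    print("Missing S3 settings:", ", ".join(gaps) if gaps else "none")
```

An empty result only means the values are set, not that they are valid; a wrong key or region will still fail when S3 is first contacted.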

## Option A: Ingest from S3

Pass S3 URIs (files or prefixes) directly to `remember()`. When credentials are set, a directory or prefix expands to the files under it.

```python theme={null}
import asyncio
import cognee


async def main():
    # Single file: ingest and build the graph in one call
    await cognee.remember(
        "s3://cognee-s3-small-test/Natural_language_processing.txt",
        dataset_name="s3_single_demo",
        self_improvement=False,
    )

    # Folder/prefix (recursively expands)
    await cognee.remember(
        "s3://cognee-s3-small-test",
        dataset_name="s3_prefix_demo",
        self_improvement=False,
    )

    # Mixed list
    await cognee.remember(
        [
            "s3://cognee-s3-small-test/Natural_language_processing.txt",
            "Some inline text to ingest",
        ],
        dataset_name="s3_mixed_demo",
        self_improvement=False,
    )

if __name__ == "__main__":
    asyncio.run(main())
```

<Accordion title="Legacy guide">
  ```python theme={null}
  import asyncio
  import cognee

  async def main():
      # Single file
      await cognee.add("s3://my-bucket/docs/paper.pdf")

      # Folder/prefix (recursively expands)
      await cognee.add("s3://my-bucket/datasets/reports/")

      # Mixed list
      await cognee.add([
          "s3://my-bucket/docs/paper.pdf",
          "Some inline text to ingest",
      ])

      # Process the data
      await cognee.cognify()

  if __name__ == "__main__":
      asyncio.run(main())
  ```
</Accordion>

This loads data directly from S3 using the `s3://` URI. `remember()` expands prefixes, reads the S3 objects, and builds retrieval-ready memory for each target dataset.

<Note>
  This example uses only S3 paths to keep things simple. In practice, you can mix S3 files with local files, scope ingestion by dataset, and apply custom loaders, all through the same `remember()` flow.
</Note>
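
To make the shape of the inputs concrete, here is a small sketch of how an `s3://` URI breaks down into a bucket and a key (or prefix), and how a mixed list can be separated before it is passed on. Both `split_s3_uri` and `partition_inputs` are hypothetical helpers written for illustration; they are not Cognee functions and do not reflect how Cognee parses its inputs internally.

```python
S3_SCHEME = "s3://"


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key).

    The key is '' for a bare bucket, and keeps its trailing '/' for a prefix.
    Hypothetical helper for illustration only.
    """
    if not uri.startswith(S3_SCHEME):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(S3_SCHEME):].partition("/")
    if not bucket:
        raise ValueError(f"S3 URI has no bucket: {uri!r}")
    return bucket, key


def partition_inputs(items: list[str]) -> tuple[list[str], list[str]]:
    """Separate S3 URIs from everything else (inline text, local paths)."""
    s3_uris = [item for item in items if item.startswith(S3_SCHEME)]
    other = [item for item in items if not item.startswith(S3_SCHEME)]
    return s3_uris, other


if __name__ == "__main__":
    mixed = [
        "s3://cognee-s3-small-test/Natural_language_processing.txt",
        "Some inline text to ingest",
    ]
    s3_uris, other = partition_inputs(mixed)
    for uri in s3_uris:
        print(split_s3_uri(uri))
    print(f"{len(other)} non-S3 item(s)")
```

A check like this is useful mainly for logging or validating a batch up front; `remember()` accepts the mixed list as-is.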

## Option B: Store Cognee Data on S3

Keep Cognee's generated files (text copies, system files) on S3 by pointing roots to S3 URLs.

Add this to your `.env`:

```dotenv theme={null}
DATA_ROOT_DIRECTORY="s3://my-bucket/cognee/data"
SYSTEM_ROOT_DIRECTORY="s3://my-bucket/cognee/system"
# Optional: force S3 backend detection
STORAGE_BACKEND="s3"
```

This configures Cognee to store all its internal files (processed data, system files) on S3 instead of locally.

<Info>
  Cognee chooses S3 storage when roots start with `s3://` (or when `STORAGE_BACKEND=s3` and both roots are S3 URLs). Credentials from `.env` are required.
</Info>
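
The rule above can be turned into a quick pre-flight check on your settings. This is a sketch under stated assumptions: `storage_config_problems` is a hypothetical helper that only mirrors the rule as described here, it does not call or reproduce Cognee's own detection logic, and treating a single S3 root as a problem is this sketch's own choice.

```python
from typing import Mapping

ROOT_KEYS = ("DATA_ROOT_DIRECTORY", "SYSTEM_ROOT_DIRECTORY")


def storage_config_problems(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems with the S3 storage settings in `env`.

    Hypothetical check based only on the rule stated in this guide.
    """
    problems = []
    on_s3 = {key: env.get(key, "").startswith("s3://") for key in ROOT_KEYS}
    forced = env.get("STORAGE_BACKEND", "").lower() == "s3"

    if forced:
        # Forcing the S3 backend only makes sense when both roots are S3 URLs.
        for key in ROOT_KEYS:
            if not on_s3[key]:
                problems.append(
                    f"STORAGE_BACKEND is s3 but {key} is not an s3:// URL"
                )
    elif any(on_s3.values()) and not all(on_s3.values()):
        # Assumption of this sketch: a single S3 root is flagged as a mistake.
        problems.append(
            "only one root is an s3:// URL; set both to keep all files on S3"
        )
    return problems


if __name__ == "__main__":
    example = {
        "DATA_ROOT_DIRECTORY": "s3://my-bucket/cognee/data",
        "SYSTEM_ROOT_DIRECTORY": "s3://my-bucket/cognee/system",
        "STORAGE_BACKEND": "s3",
    }
    print(storage_config_problems(example) or "configuration looks consistent")
```

Passing `os.environ` instead of the example dictionary checks the live environment, assuming the `.env` values have already been loaded into it.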

<Columns cols={3}>
  <Card title="Core Concepts" icon="brain" href="/core-concepts/overview">
    Understand knowledge graph fundamentals
  </Card>

  <Card title="Setup Configuration" icon="settings" href="/setup-configuration/overview">
    Configure providers and databases
  </Card>

  <Card title="API Reference" icon="code" href="/api-reference/introduction">
    Explore API endpoints
  </Card>
</Columns>
