Content Ingestion

VaultStream supports multiple methods for ingesting video content. This guide covers the REST API, watch-folder automation, and batch import patterns.

Ingestion Methods

Method         | Best For                            | Max File Size
REST API       | Individual uploads, CI/CD           | 50 GB
Watch Folders  | Automated batch from network shares | Any
S3/GCS Monitor | Cloud-native workflows              | Any
RSS/Atom Feed  | Content syndication                 | 1,000 items/hr
FTP Drop Zones | Legacy system integration           | 50 GB

REST API

POST /v1/content/upload
Authorization: Bearer <TOKEN>
Content-Type: multipart/form-data

Parameters: file (required), title (required), description, visibility, metadata (JSON), transcode_profile, folder_id.

Python Upload Script

import requests

def upload_video(filepath, title, token, visibility="private"):
    with open(filepath, "rb") as f:
        resp = requests.post(
            "https://api.cyfr.technology/v1/content/upload",
            headers={"Authorization": f"Bearer {token}"},
            files={"file": f},
            data={"title": title, "visibility": visibility}
        )
    resp.raise_for_status()
    return resp.json()["content_id"]
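
The function above sends only title and visibility. As a rough sketch, the optional parameters from the list earlier could be assembled into form fields as shown here. The helper name build_upload_fields and the sample values are hypothetical, and the sketch assumes metadata travels as a JSON-encoded string, which the parameter list suggests but does not spell out.

```python
import json

def build_upload_fields(title, description=None, visibility="private",
                        metadata=None, transcode_profile=None, folder_id=None):
    """Build form fields for POST /v1/content/upload (hypothetical helper)."""
    if not title:
        raise ValueError("title is required")
    fields = {"title": title, "visibility": visibility}
    if description is not None:
        fields["description"] = description
    if metadata is not None:
        # Assumption: the API expects metadata as a JSON string.
        fields["metadata"] = json.dumps(metadata)
    if transcode_profile is not None:
        fields["transcode_profile"] = transcode_profile
    if folder_id is not None:
        fields["folder_id"] = folder_id
    return fields

fields = build_upload_fields(
    "Q3 Town Hall",
    metadata={"department": "HR"},
    folder_id="folder-123",  # placeholder value, not a real ID format
)
```

The resulting dictionary can be passed as the data argument of requests.post, next to files={"file": f}.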

Watch-Folder Automation

Configure a directory for continuous monitoring. Any video file placed there is automatically uploaded and transcoded.

Python Watch-Folder Script (Customizable)

Most enterprise customers deploy a variant of the script below. It covers filename parsing, metadata extraction, hash-based duplicate detection, and basic error handling. Customers are responsible for adapting it to their own content organization and naming conventions.

import hashlib
import json
import os
import time
from datetime import datetime, timezone
from pathlib import Path

import requests

WATCH_DIR = "/mnt/media/ingest"
API_URL = "https://api.cyfr.technology/v1/content/upload"
TOKEN = os.environ["VS_API_TOKEN"]
# SHA-256 digests of files already uploaded. Held in memory only,
# so duplicate detection resets whenever the script restarts.
SEEN = set()

def file_hash(path):
    """Return the SHA-256 hex digest of a file, read in 8 KB chunks."""
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            sha.update(chunk)
    return sha.hexdigest()

def parse_filename(path):
    """Extract metadata from filenames.

    Only the "Department - Title (2026).mp4" convention is split
    into fields here, and the year stays part of the title. Names
    such as "series_s01e03_title.mp4" or
    "CEO_TownHall_2026-07-04.mp4" are used whole as the title.

    Customers commonly extend this function to match their
    internal naming conventions. See Metadata Providers
    for the full customization API.
    """
    name = Path(path).stem
    parts = name.split(" - ", 1)
    if len(parts) == 2:
        return {"department": parts[0], "title": parts[1]}
    return {"title": name}

def upload(filepath):
    """Upload one file; return its content_id, or None for a duplicate."""
    h = file_hash(filepath)
    if h in SEEN:
        return None
    meta = parse_filename(filepath)
    with open(filepath, "rb") as f:
        resp = requests.post(API_URL,
            headers={"Authorization": f"Bearer {TOKEN}"},
            files={"file": f},
            data={
                "title": meta.get("title"),
                # The metadata parameter is JSON, so serialize it as such.
                "metadata": json.dumps({
                    "ingested_at": datetime.now(timezone.utc).isoformat(),
                    "source_path": str(filepath),
                    **meta
                })
            })
    resp.raise_for_status()
    # Record the hash only after the API has accepted the upload.
    SEEN.add(h)
    return resp.json()["content_id"]

def watch():
    while True:
        for ext in ("*.mp4", "*.mkv", "*.mov", "*.webm"):
            for f in Path(WATCH_DIR).rglob(ext):
                try:
                    cid = upload(str(f))
                    if cid:
                        print(f"[{datetime.now()}] {f} -> {cid}")
                except Exception as e:
                    print(f"Error: {f}: {e}")
        time.sleep(30)

if __name__ == "__main__":
    watch()
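
Two gaps in the script above are worth noting: SEEN lives in memory, so a restart re-uploads every file still in the folder, and a file that is still being copied can be picked up mid-write. The sketch below shows one possible way to handle both. The helper names, the JSON state file, and the modification-time heuristic are all assumptions made for illustration, not VaultStream features.

```python
import json
import os
import tempfile
import time
from pathlib import Path

def load_seen(state_file):
    """Return the set of hashes stored in state_file, or an empty set."""
    path = Path(state_file)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))

def save_seen(state_file, seen):
    """Write the hash set to state_file as a sorted JSON list."""
    Path(state_file).write_text(json.dumps(sorted(seen)))

def is_settled(path, min_age_seconds=60, now=None):
    """True if the file was last modified at least min_age_seconds ago."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) >= min_age_seconds

# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    state = os.path.join(tmp, "seen.json")
    save_seen(state, {"abc123"})
    restored = load_seen(state)

    clip = os.path.join(tmp, "clip.mp4")
    Path(clip).write_bytes(b"partial data")
    fresh = is_settled(clip, min_age_seconds=60)
    aged = is_settled(clip, min_age_seconds=60, now=time.time() + 120)
```

In the watch loop, SEEN could be initialized from load_seen at startup and written back with save_seen after each successful upload, with files skipped until is_settled returns True. A modification-time check is only a heuristic; some copy tools preserve the source timestamp.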

Customization Points

The script above is a template. Enterprise customers typically extend it to:

  • Parse custom filename conventions — season/episode markers, department codes, date stamps
  • Integrate with metadata APIs — TMDB, IMDB, internal asset management systems
  • Apply content organization rules — auto-tagging, folder assignment, access control policies
  • Handle post-upload workflows — notification triggers, webhook callbacks, downstream processing
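
As an illustration of the first customization point, the sketch below parses the three filename conventions mentioned in the parse_filename docstring using regular expressions. The function name parse_filename_extended, the patterns, and the returned field names are assumptions for this example; adjust them to your own naming scheme.

```python
import re
from pathlib import Path

# Illustrative patterns, tried in order. All names here are hypothetical.
DEPT_TITLE_YEAR = re.compile(
    r"^(?P<department>.+?) - (?P<title>.+?)(?: \((?P<year>\d{4})\))?$"
)
SERIES_EPISODE = re.compile(
    r"^(?P<series>.+?)_s(?P<season>\d{2})e(?P<episode>\d{2})_(?P<title>.+)$",
    re.IGNORECASE,
)
NAME_DATE = re.compile(r"^(?P<title>.+)_(?P<date>\d{4}-\d{2}-\d{2})$")

def parse_filename_extended(path):
    """Return metadata fields from the first pattern that matches the stem."""
    stem = Path(path).stem
    for pattern in (DEPT_TITLE_YEAR, SERIES_EPISODE, NAME_DATE):
        match = pattern.match(stem)
        if match:
            # Drop optional groups that did not participate in the match.
            return {k: v for k, v in match.groupdict().items() if v is not None}
    # No convention matched: fall back to the whole stem as the title.
    return {"title": stem}

examples = [
    "Department - Title (2026).mp4",
    "series_s01e03_title.mp4",
    "CEO_TownHall_2026-07-04.mp4",
]
parsed = [parse_filename_extended(name) for name in examples]
```

All values are returned as strings. Converting season, episode, or date to typed values is left to the caller.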

VaultStream provides the ingestion API and transcoding infrastructure. The content you feed into it, how you organize it, and who you grant access to are entirely under your control.

Cloud Storage Monitors

S3 and GCS bucket monitoring is configured via the Admin API:

POST /v1/admin/cloud-monitors
{
  "provider": "s3",
  "bucket": "acme-training-videos",
  "region": "us-east-1",
  "prefix": "incoming/",
  "iam_role_arn": "arn:aws:iam::123456789012:role/vaultstream-ingest",
  "poll_interval_seconds": 60
}
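
A minimal sketch of building this request body in Python before sending it. The helper build_monitor_config and its validation rules are hypothetical; "gcs" as a provider value is an assumption, since only "s3" appears in the example above.

```python
import json

# Assumed provider identifiers; only "s3" is confirmed by the example request.
SUPPORTED_PROVIDERS = {"s3", "gcs"}

def build_monitor_config(provider, bucket, prefix="",
                         poll_interval_seconds=60, **extra):
    """Assemble a cloud-monitor request body (hypothetical helper)."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    if poll_interval_seconds <= 0:
        raise ValueError("poll_interval_seconds must be positive")
    config = {
        "provider": provider,
        "bucket": bucket,
        "prefix": prefix,
        "poll_interval_seconds": poll_interval_seconds,
    }
    # Provider-specific fields such as region or iam_role_arn pass through.
    config.update(extra)
    return config

config = build_monitor_config(
    "s3",
    "acme-training-videos",
    prefix="incoming/",
    region="us-east-1",
    iam_role_arn="arn:aws:iam::123456789012:role/vaultstream-ingest",
)
body = json.dumps(config, indent=2)
```

The dictionary could then be sent with requests.post using json=config against /v1/admin/cloud-monitors. Whether the Admin API accepts the same Bearer token as the upload endpoint is not stated here, so treat that as something to confirm.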

Feed Import

Feed imports for syndicated RSS/Atom content are created with a POST request:

POST /v1/admin/feed-imports
{
  "name": "Corporate Training Feed",
  "url": "https://internal.corp.com/training/feed.xml",
  "fetch_interval_minutes": 60,
  "field_mapping": {
    "title": "title",
    "description": "summary",
    "media_url": "enclosure.@url"
  }
}
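
The field_mapping values read like dotted paths into each feed entry, with "@url" naming an attribute. The sketch below shows how such a mapping might resolve against a single entry. It assumes that interpretation, uses made-up sample data, and ignores XML namespaces, which real Atom feeds use; it is not a description of VaultStream's actual resolver.

```python
import xml.etree.ElementTree as ET

FIELD_MAPPING = {
    "title": "title",
    "description": "summary",
    "media_url": "enclosure.@url",
}

def resolve(item, path):
    """Follow a dotted path; a final '@name' segment reads an attribute."""
    node = item
    for segment in path.split("."):
        if segment.startswith("@"):
            return node.get(segment[1:])
        node = node.find(segment)
        if node is None:
            return None
    return (node.text or "").strip()

# Sample entry invented for this illustration.
SAMPLE_ITEM = """
<item>
  <title>Safety Basics</title>
  <summary>Annual refresher course.</summary>
  <enclosure url="https://internal.corp.com/training/safety.mp4" type="video/mp4"/>
</item>
"""

item = ET.fromstring(SAMPLE_ITEM)
record = {field: resolve(item, path) for field, path in FIELD_MAPPING.items()}
```

A path whose element is missing resolves to None, which lets a caller decide whether to skip the entry or import it with partial fields.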

Rate Limits

Tier         | Daily Limit | Max File Size | Concurrent Uploads
Starter      | 10 GB       | 5 GB          | 2
Professional | 100 GB      | 50 GB         | 10
Enterprise   | Custom      | Custom        | Custom
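
A client can check an upload against these limits before sending it. The sketch below encodes the Starter and Professional rows. The function name check_upload is hypothetical, and the sketch assumes decimal gigabytes and that the daily limit counts uploaded bytes; neither is stated in the table. Enterprise is omitted because its limits are custom.

```python
GB = 10**9  # assumption: decimal gigabytes, the table does not specify

TIER_LIMITS = {
    "starter": {"daily_bytes": 10 * GB, "max_file_bytes": 5 * GB, "concurrent": 2},
    "professional": {"daily_bytes": 100 * GB, "max_file_bytes": 50 * GB, "concurrent": 10},
}

def check_upload(tier, file_bytes, used_today_bytes=0, active_uploads=0):
    """Return a list of reasons the upload would exceed the tier's limits."""
    limits = TIER_LIMITS[tier]
    problems = []
    if file_bytes > limits["max_file_bytes"]:
        problems.append("file exceeds max file size")
    if used_today_bytes + file_bytes > limits["daily_bytes"]:
        problems.append("upload would exceed daily limit")
    if active_uploads >= limits["concurrent"]:
        problems.append("concurrent upload limit reached")
    return problems

too_big = check_upload("starter", 6 * GB)
over_daily = check_upload("starter", 4 * GB, used_today_bytes=8 * GB)
busy = check_upload("professional", 1 * GB, active_uploads=10)
ok = check_upload("professional", 20 * GB)
```

An empty list means the upload fits within all three limits. The server remains the source of truth, so a client-side check like this only avoids obviously doomed requests.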