Overview
April Flow as a developer platform
April Flow is a sovereign AI platform for documents and data. Developers can use it from backend services, scripts, workers, command-line tools, and internal applications.
The SDKs expose typed APIs for collections, uploads, documents, prompts, identity, billing, support messages, and notifications. The scraper project demonstrates how a Python tool can collect content and push it into April Flow.
Typical integration model
Application, worker, script, or CLI
  -> April Flow SDK
      -> Collections
      -> Uploads
      -> Documents
      -> Prompt sessions
      -> Notifications
  -> Your assistant, scraper, automation, or product workflow
Java SDK
Java SDK for server-side applications, backend services, batch jobs, and JVM-based systems.
Python SDK
Python SDK for scripts, CLI tools, automation, scraping, data processing, and backend services.
Scraper CLI
A Python scraper project using the April Flow Python SDK to collect and upload content.
Repository
Repository layout
The repository groups the SDKs and tools in one place so developers can browse both language implementations and the scraper example.
aprilflow-sdk
  aprilflow-java-sdk
    Java SDK
    Maven project
    Java 17+
    Jersey HTTP client
    Jackson JSON serialization
    SSE notification support
  aprilflow-python-sdk
    Python SDK
    Python 3.10+
    httpx HTTP client
    pydantic models
    notification stream support
  aprilflow-scraper
    Python scraper CLI
    Uses aprilflow-python-sdk
    Uses Scrapy and Playwright
    Provides collection, scrape, upload, and shell commands
    Designed around extendable scraping profiles
Requirements
Java SDK
- Java 17 or newer
- Maven
- April Flow base URL
- April Flow user key
Python SDK
- Python 3.10 or newer
- httpx
- pydantic
- April Flow base URL
- April Flow user key
Keep user keys server-side
April Flow user keys should be stored on trusted backend systems only. Do not embed them in browser JavaScript, mobile apps, public repositories, or downloadable frontend bundles.
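One way to apply this in a backend process is to read the key from the environment at startup and fail fast when it is missing. The sketch below is illustrative only: `Settings` and `load_settings` are hypothetical helpers, not part of either SDK, and the variable names `APRILFLOW_BASE_URL` and `APRILFLOW_USER_KEY` are the ones used in the scraper configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True, repr=False)
class Settings:
    """Connection settings for a backend worker (hypothetical helper)."""
    base_url: str
    user_key: str

    def __repr__(self) -> str:
        # Mask the key so it cannot leak into logs or tracebacks.
        return f"Settings(base_url={self.base_url!r}, user_key='***')"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment and fail fast if any are missing."""
    required = ("APRILFLOW_BASE_URL", "APRILFLOW_USER_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        base_url=env["APRILFLOW_BASE_URL"].rstrip("/"),
        user_key=env["APRILFLOW_USER_KEY"],
    )
```

The resulting values can then be passed to the client factory. Masking the key in `__repr__` is a design choice that keeps accidental `print(settings)` calls harmless.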
Installation
Install the SDKs
Use the Java SDK from Maven, or install it locally while developing. Use the Python SDK in editable mode for local development.
<dependency>
    <groupId>com.aprilsoftware</groupId>
    <artifactId>aprilflow-java-sdk</artifactId>
    <version>1.0</version>
</dependency>
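# Java SDK: build and install into the local Maven repository (from the repository root)
cd aprilflow-java-sdk
mvn clean install
# Python SDK: editable install for local development (from the repository root)
cd aprilflow-python-sdk
pip install -e .
# Optional extras for notifications and for running the tests
pip install -e '.[notifications]'
pip install -e '.[test]'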
Client
Create a client
Both SDKs authenticate with an April Flow user key. The client obtains access tokens and uses bearer authentication for API calls.
// Java: factory method
import aprilflow.sdk.AprilFlowClient;
AprilFlowClient client;
client = AprilFlowClient.create(
    "https://api.aprilflow.ai",
    System.getenv("APRILFLOW_USER_KEY")
);
System.out.println(client.prompt().builtWith());
// Java: builder alternative
AprilFlowClient client;
client = AprilFlowClient.builder()
    .baseUrl("https://api.aprilflow.ai")
    .userKey(System.getenv("APRILFLOW_USER_KEY"))
    .build();
# Python: factory method, key read from the environment
import os
from aprilflow import AprilFlowClient
client = AprilFlowClient.create(
    base_url="https://api.aprilflow.ai",
    user_key=os.environ["APRILFLOW_USER_KEY"],
)
print(client.prompt.built_with())
# Python: direct construction; replace the placeholder key and close the client when done
from aprilflow import AprilFlowClient
client = AprilFlowClient(
    base_url="https://api.aprilflow.ai",
    user_key="YOUR_USER_KEY",
)
client.close()
Collections
Work with collections
Collections group uploaded content and can be used by prompts. A collection can represent a knowledge base, customer workspace, website, dataset, or product documentation set.
// Java
import aprilflow.sdk.collection.Collection;
import aprilflow.sdk.collection.CollectionVisibility;
import aprilflow.sdk.collection.CreateCollectionRequest;
Collection collection;
collection = client.collection().create(
    CreateCollectionRequest.create()
        .name("SDK example collection")
        .description("Collection created from the Java SDK")
        .visibility(CollectionVisibility.Private)
);
System.out.println(collection.getId());
# Python
from aprilflow import CollectionVisibility
collection = client.collection.create(
    name="SDK example collection",
    description="Collection created from the Python SDK",
    visibility=CollectionVisibility.PRIVATE,
)
print(collection.id)
Uploads
Upload documents and wait for processing
The upload API supports bytes, files, and streams. Both SDKs also provide wait helpers that subscribe to notifications and return once processing reaches a terminal status.
// Java: upload a file and block until processing reaches a terminal status
import aprilflow.sdk.upload.Upload;
import aprilflow.sdk.upload.UploadFileRequest;
import aprilflow.sdk.upload.UploadWaitOptions;
import java.nio.file.Path;
Upload upload;
upload = client.upload().uploadFileAndWait(
    UploadFileRequest.create()
        .collectionId(collection.getId())
        .file(Path.of("/path/to/document.pdf"))
        .fileName("document.pdf")
        .contentType("application/pdf"),
    UploadWaitOptions.defaultOptions()
);
System.out.println(upload.getStatus());
# Python: same operation with explicit wait options
from pathlib import Path
from aprilflow import UploadWaitOptions, UploadStatus
upload = client.upload.upload_file_and_wait(
    collection_id=collection.id,
    file=Path("/path/to/document.pdf"),
    file_name="document.pdf",
    content_type="application/pdf",
    wait_options=UploadWaitOptions(
        timeout=300,
        terminal_statuses=[
            UploadStatus.PROCESSED,
            UploadStatus.IGNORED,
            UploadStatus.CANCELED,
            UploadStatus.ON_ERROR,
            UploadStatus.DOCUMENT_DELETED,
            UploadStatus.QUOTA_EXCEEDED,
        ],
    ),
)
print(upload.status)
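Conceptually, a wait helper consumes status updates until one of the terminal statuses appears or the timeout expires. The sketch below illustrates that loop in plain Python. It is not the SDK's implementation: `wait_for_terminal` and the `updates` iterable are hypothetical stand-ins for the notification stream, and statuses are plain strings that mirror the `UploadStatus` names used above.

```python
import time
from typing import Callable, FrozenSet, Iterable

# Strings mirroring the UploadStatus names listed above (stand-ins, not SDK values).
TERMINAL_STATUSES: FrozenSet[str] = frozenset({
    "PROCESSED", "IGNORED", "CANCELED",
    "ON_ERROR", "DOCUMENT_DELETED", "QUOTA_EXCEEDED",
})


def wait_for_terminal(
    updates: Iterable[str],
    timeout: float = 300.0,
    terminal: FrozenSet[str] = TERMINAL_STATUSES,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Return the first terminal status seen in `updates`.

    Raises TimeoutError if the deadline passes or the stream ends first.
    The deadline is only checked between updates, so a stream that can
    block indefinitely would also need its own read timeout.
    """
    deadline = clock() + timeout
    for status in updates:
        if status in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"no terminal status within {timeout} seconds")
    raise TimeoutError("update stream ended before a terminal status")
```

Injecting `clock` keeps the function easy to test without real waiting.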
Documents
Search and retrieve documents
The document API provides search, batch retrieval, object retrieval, original document operations, and deletion.
// Java
import aprilflow.sdk.document.DocumentItem;
import aprilflow.sdk.document.DocumentSearchRequest;
java.util.List<DocumentItem> documents;
documents = client.document().search(
    DocumentSearchRequest.create()
        .collectionId(collection.getId())
        .text("search text")
        .maxResult(10)
);
# Python
from aprilflow import DocumentSearchRequest
documents = client.document.search(
    DocumentSearchRequest(
        collection_id=collection.id,
        text="search text",
        max_result=10,
    )
)
Prompts
Create prompt sessions
Prompt sessions let applications keep conversational context. They can run with plain text, selected collections, or selected uploads.
// Java: create a session and wait for the prompt to finish
import aprilflow.sdk.prompt.CreateSessionResult;
import aprilflow.sdk.prompt.PromptRequest;
import aprilflow.sdk.prompt.PromptWaitOptions;
CreateSessionResult result;
result = client.prompt().session().createAndWait(
    PromptRequest.create()
        .text("Summarize the uploaded content.")
        .collectionIds(java.util.List.of(collection.getId())),
    PromptWaitOptions.defaultOptions()
);
System.out.println(result.getPrompt().getOutput());
# Python: same operation with explicit wait options
from aprilflow import PromptRequest, PromptWaitOptions, PromptStatus
result = client.prompt.session.create_and_wait(
    PromptRequest(
        text="Summarize the uploaded content.",
        collection_ids=[collection.id],
    ),
    wait_options=PromptWaitOptions(
        timeout=300,
        terminal_statuses=[
            PromptStatus.COMPLETED,
            PromptStatus.INTERRUPTED,
            PromptStatus.ON_ERROR,
            PromptStatus.QUOTA_EXCEEDED,
        ],
    ),
)
print(result.prompt.output)
Notifications
Listen for asynchronous processing
Upload and prompt processing can be followed through notification streams. Most applications should use the higher-level wait helpers; direct subscriptions are available when a workflow needs to react to individual events.
// Java: watch a single upload
import aprilflow.sdk.notification.NotificationSubscription;
NotificationSubscription subscription;
subscription = client.upload().watch(
    collection.getId(),
    upload.getId(),
    notification -> {
        System.out.println(notification.getAction());
        System.out.println(notification.getObjectType());
        System.out.println(notification.getObject());
    },
    error -> {
        error.printStackTrace();
    }
);
// Close the subscription once the workflow no longer needs events
subscription.close();
# Python: listen to upload and prompt notifications
from aprilflow import Notification
def on_notification(notification: Notification):
    print(notification.action, notification.object_type, notification.object)
def on_error(error: Exception):
    print("notification stream error", error)
subscription = client.notification.listen(
    ["collection.upload", "prompt"],
    on_notification,
    on_error,
)
# Close the subscription once the workflow no longer needs events
subscription.close()
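When a workflow does subscribe directly, the notification callback often just routes each event to a handler. The following is a minimal sketch of that routing. `Event` is a local stand-in carrying the three fields printed above (`action`, `object_type`, `object`), and `Dispatcher` is a hypothetical helper, not an SDK class. The exact values the SDK puts in `object_type` are not shown here, so the strings a handler is registered under are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Event:
    """Stand-in for an SDK notification (same three fields as printed above)."""
    action: str
    object_type: str
    object: Any


class Dispatcher:
    """Routes events to handlers registered per object type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = {}

    def on(self, object_type: str, handler: Callable[[Event], None]) -> None:
        """Register a handler for one object type."""
        self._handlers.setdefault(object_type, []).append(handler)

    def __call__(self, event: Event) -> None:
        # Events with no registered handler are ignored rather than raising.
        for handler in self._handlers.get(event.object_type, []):
            handler(event)
```

Because a `Dispatcher` instance is callable, it could be passed wherever a notification callback is expected, provided the real notification object exposes the same attributes.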
Scraper
Scraper CLI example
The repository includes a Python scraper CLI that uses the April Flow Python SDK for API calls. It is built around extendable scraping profiles, so developers can add new sources while reusing the same collection selection, run management, upload, and shell workflow.
Scraper workflow
- Install the Python SDK and scraper project.
- Set the April Flow base URL and user key.
- List or select the target collection.
- Run a scraping profile.
- Review the generated run directory.
- Upload the scraped run directory into April Flow.
- Use prompt sessions to ask questions over the uploaded content.
Scraper profiles
Extend the scraper with profiles
The scraper is designed around profiles. A profile defines how a specific source should be discovered, crawled, parsed, normalized, and written to a run directory before upload.
This makes the scraper extendable. Developers can add their own profiles for public websites, documentation portals, regulatory sources, internal knowledge bases, or customer-specific sources while reusing the same command-line workflow and April Flow upload pipeline.
Profile responsibility
- Define the source to scrape
- Discover pages, records, or source URLs
- Fetch content using Scrapy, Playwright, or source-specific logic
- Extract relevant page content
- Normalize title, text, URL, date, and metadata
- Write files into a structured run directory
Shared scraper pipeline
- Runs profiles from the CLI
- Keeps run outputs organized
- Supports collection selection
- Uploads scraped files to April Flow
- Supports interactive shell usage
- Can be used manually, in cron jobs, or in automation workers
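As an illustration of the upload step in such a pipeline, a script can walk a run directory and upload files with a bounded number of workers, which is the behavior an option like `--concurrency` suggests. This sketch assumes nothing about the scraper's internals: `upload_run_dir` is hypothetical, and `upload_one` is a caller-supplied function, for example a thin wrapper around the SDK's file upload.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict


def upload_run_dir(
    run_dir: Path,
    upload_one: Callable[[Path], str],
    concurrency: int = 2,
) -> Dict[str, str]:
    """Upload every file under run_dir; return {relative path: result or error}."""
    files = sorted(p for p in run_dir.rglob("*") if p.is_file())
    results: Dict[str, str] = {}
    # max_workers caps how many uploads are in flight at the same time.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {pool.submit(upload_one, path): path for path in files}
        for future, path in futures.items():
            key = path.relative_to(run_dir).as_posix()
            try:
                results[key] = future.result()
            except Exception as exc:
                # Record the failure and continue so one bad file does not stop the run.
                results[key] = f"error: {exc}"
    return results
```

Returning a per-file result map makes it straightforward to retry only the files that failed.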
Conceptual profile structure
aprilflow-scraper
  profiles
    cssf
      Discover source pages
      Extract page content
      Normalize documents
      Write run output
    your_profile
      Discover your source
      Extract your content
      Normalize your documents
      Reuse the same upload command
Why profiles matter
Scraping logic is usually source-specific, but upload, collection selection, run management, shell commands, logging, and April Flow API calls should not be rewritten for every source. Profiles keep source-specific extraction isolated while the rest of the scraper remains reusable.
This makes the scraper useful both as an example and as a starting point for real integrations. A team can keep several profiles in the same project, each targeting a different website, documentation space, data provider, or internal content system.
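To make the split concrete, a profile can be pictured as a small class with one method per source-specific responsibility, plus a shared runner that stays the same for every source. This is a sketch of the idea only. The scraper's real base classes, method names, and output format are not shown here, so `Page`, `Profile`, `run_profile`, `ExampleProfile`, and the one-JSON-file-per-page layout are all hypothetical.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Dict, Iterable


@dataclass
class Page:
    """Normalized document: title, text, URL, date, and metadata."""
    title: str
    text: str
    url: str
    date: str = ""
    metadata: Dict[str, str] = field(default_factory=dict)


class Profile(ABC):
    """Source-specific part: discovery and extraction."""
    name: str

    @abstractmethod
    def discover(self) -> Iterable[str]:
        """Return the URLs or record identifiers to fetch."""

    @abstractmethod
    def extract(self, url: str) -> Page:
        """Fetch one item and return it as a normalized Page."""


def run_profile(profile: Profile, run_dir: Path, limit: int = 0) -> int:
    """Shared part: write each page as a JSON file and return the count."""
    run_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for url in profile.discover():
        if limit and count >= limit:
            break
        page = profile.extract(url)
        out = run_dir / f"{count:05d}.json"
        out.write_text(json.dumps(asdict(page), ensure_ascii=False), encoding="utf-8")
        count += 1
    return count


class ExampleProfile(Profile):
    """Toy profile with in-memory content, standing in for a real crawler."""
    name = "example"

    def discover(self) -> Iterable[str]:
        return ["https://example.com/a", "https://example.com/b"]

    def extract(self, url: str) -> Page:
        return Page(title=url.rsplit("/", 1)[-1], text="placeholder text", url=url)
```

Adding a source then means writing another `Profile` subclass, while `run_profile` and the upload step remain untouched.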
Install
# From the repository root
cd aprilflow-scraper
pip install -e ../aprilflow-python-sdk
pip install -e .
playwright install chromium
playwright install-deps chromium
Configure
export APRILFLOW_BASE_URL="https://api.aprilflow.ai"
export APRILFLOW_USER_KEY="..."
export SCRAPER_DATA_DIR="$HOME/scraper"
Command line
scraper collections list
scraper collections get ab12cd34
scraper collections use ab12cd34
scraper scrape cssf --limit 20
scraper upload <run_dir> --into ab12cd34 --concurrency 2
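For cron jobs or automation workers, the same commands can be driven from a script. The sketch below only assembles argument lists for the `scrape` and `upload` commands shown above and hands them to `subprocess.run`. `scrape_command`, `upload_command`, and `run_step` are hypothetical helpers, and the run directory is passed in by the caller because the scraper generates it.

```python
import subprocess
from typing import Callable, List


def scrape_command(profile: str, limit: int = 20) -> List[str]:
    """Argument list for: scraper scrape <profile> --limit <n>"""
    return ["scraper", "scrape", profile, "--limit", str(limit)]


def upload_command(run_dir: str, collection_id: str, concurrency: int = 2) -> List[str]:
    """Argument list for: scraper upload <run_dir> --into <id> --concurrency <n>"""
    return ["scraper", "upload", run_dir, "--into", collection_id,
            "--concurrency", str(concurrency)]


def run_step(command: List[str], runner: Callable = subprocess.run) -> None:
    """Run one command; check=True raises CalledProcessError on a non-zero exit."""
    runner(command, check=True)
```

Passing argument lists instead of a shell string avoids quoting problems with paths that contain spaces.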
Interactive shell
scraper shell
collections list
use ab12cd34
scrape cssf
upload ~/scraper/runs/2026-02-11_174642_cssf --concurrency 2
quit
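The run directory in the session above, `2026-02-11_174642_cssf`, appears to combine a timestamp with the profile name. Assuming that pattern (date, then time as HHMMSS, then profile), a script can build or parse such names as sketched below. `run_dir_name` and `parse_run_dir_name` are hypothetical helpers, and the scraper's actual naming logic may differ.

```python
from datetime import datetime
from typing import Tuple

# Assumed from the example directory name; not confirmed by the scraper itself.
RUN_DIR_FORMAT = "%Y-%m-%d_%H%M%S"
STAMP_LENGTH = 17  # len("YYYY-MM-DD_HHMMSS")


def run_dir_name(profile: str, started: datetime) -> str:
    """Build a name such as 2026-02-11_174642_cssf."""
    return f"{started.strftime(RUN_DIR_FORMAT)}_{profile}"


def parse_run_dir_name(name: str) -> Tuple[datetime, str]:
    """Split a run directory name into its start time and profile name."""
    # Slice by fixed width so profile names containing underscores still parse.
    stamp, profile = name[:STAMP_LENGTH], name[STAMP_LENGTH + 1:]
    return datetime.strptime(stamp, RUN_DIR_FORMAT), profile
```

Parsing by fixed width rather than splitting on underscores keeps names such as `your_profile` intact.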
Production
Production integration notes
Security
Store user keys in a secrets manager, environment variables, or protected server configuration. Rotate keys on a regular schedule and immediately after any suspected exposure.
Processing
Use wait helpers for simple workers. Use notification subscriptions when you need to react to asynchronous processing events.
Data boundaries
Design collections around clear boundaries such as product, customer, workspace, domain, or retention policy.
Observability
Log upload IDs, prompt IDs, collection IDs, and notification events so workers can be monitored and retried safely.
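A lightweight way to do this is to emit one JSON object per event, with the relevant IDs as fields, so log tooling can filter by upload or collection. The sketch below uses only the standard `logging` and `json` modules. `log_event`, the logger name, and the field names are hypothetical choices, not an SDK convention.

```python
import json
import logging

logger = logging.getLogger("worker")  # logger name is an arbitrary choice


def log_event(event: str, **ids: str) -> str:
    """Log one event as a single JSON line and return that line."""
    # sort_keys gives a stable field order, which keeps log diffs readable.
    line = json.dumps({"event": event, **ids}, sort_keys=True)
    logger.info(line)
    return line
```

A worker might call `log_event("upload.processed", collection_id=..., upload_id=...)` after each wait helper returns, which leaves enough context to retry a single upload later.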
GitHub
GitHub repository
Use the GitHub repository to browse the Java SDK, Python SDK, scraper CLI, tests, and examples.