Uploading Documents

Learn how to upload and manage your documents

Upload Methods

1. Web Interface (Drag & Drop)

The easiest way to upload documents is through the web interface:

  1. Navigate to the Documents page
  2. Click "Upload" or drag files directly into the upload area
  3. Select one or more files from your computer
  4. Wait for processing to complete (you'll see a progress indicator)
  5. Your documents are now ready for chatting!

2. API Upload

For programmatic uploads, use the REST API:

curl -X POST https://api.doculume.com/api/v1/documents/upload \
  -b cookies.txt \
  -F "file=@document.pdf"

# Authentication via httpOnly cookie (set during login)

See the API Reference for more details.
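
For scripts that cannot shell out to curl, the same request can be sketched in Python using only the standard library. This is an illustration under assumptions, not an official client: the helper names build_multipart and upload are hypothetical, the endpoint and the "file" field name come from the curl example above, and cookies.txt is assumed to be a Netscape-format cookie file such as the one curl saves at login. curl typically marks httpOnly cookies with a #HttpOnly_ prefix, which older Python versions may skip as comments.

```python
import http.cookiejar
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Endpoint taken from the curl example in this section.
UPLOAD_URL = "https://api.doculume.com/api/v1/documents/upload"


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type).

    Assumes the filename contains no double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload(path: str, cookie_file: str = "cookies.txt") -> bytes:
    """POST one file with the saved session cookie; returns the raw response body."""
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    jar.load(ignore_discard=True)  # keep session cookies that have no expiry
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    file_path = Path(path)
    body, content_type = build_multipart("file", file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        UPLOAD_URL,
        data=body,
        method="POST",
        headers={"Content-Type": content_type},
    )
    with opener.open(request) as response:
        return response.read()
```

Calling upload("document.pdf") would send the same file as the curl command. The response format is not described here, so the sketch returns the raw bytes rather than parsing them.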

Supported File Formats

PDF (.pdf): Including scanned PDFs with OCR support

Word (.docx, .doc): Microsoft Word documents

Text (.txt): Plain text files

Markdown (.md): Markdown formatted documents

HTML (.html, .htm): Web pages and HTML documents

CSV (.csv): Comma-separated values
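
When uploading from a script, it can help to filter out unsupported files before sending anything. The following is a minimal sketch: the extension set is copied from the list above, while the is_supported helper is a hypothetical name. The case-insensitive match is an assumption, and the service may apply its own checks (for example on file contents).

```python
from pathlib import Path

# Extensions copied from the supported formats listed in this section.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt", ".md", ".html", ".htm", ".csv"}


def is_supported(filename: str) -> bool:
    """Return True if the final file extension is in the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Only the last suffix is considered, so a name like "archive.tar.gz" is checked as ".gz" and rejected.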

Best Practices

File Size

Keep individual files under 50 MB; larger uploads fail with a "File too large" error

File Names

Use descriptive names like "Q3-Report-2024.pdf" instead of "document1.pdf"

Batch Uploads

Upload multiple files at once by selecting them together

OCR Quality

For scanned PDFs, ensure text is readable (300+ DPI recommended)
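
For batch uploads, a pre-flight size check can catch oversized files before any request is made. This is a sketch under one stated assumption: "50 MB" is read as 50 * 1024 * 1024 bytes, which the page does not confirm. The split_by_size helper is a hypothetical name.

```python
import os

# Assumption: the 50 MB limit is interpreted as 50 MiB.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def split_by_size(paths, limit=MAX_UPLOAD_BYTES):
    """Return (ok, too_large): files strictly under the limit, and the rest."""
    ok, too_large = [], []
    for path in paths:
        if os.path.getsize(path) < limit:
            ok.append(path)
        else:
            too_large.append(path)
    return ok, too_large
```

Files in the second list are candidates for splitting or compression before upload.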

Document Processing

After upload, your documents go through several processing steps:

  1. Text Extraction: Content is extracted from the document (with OCR for scanned PDFs)
  2. Chunking: Text is split into semantic chunks for better retrieval
  3. Vectorization: Chunks are converted to embeddings and stored in the vector database
  4. Ready: Document is ready for intelligent search and chat
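
To give a feel for the chunking step, here is a simplified sketch that packs paragraphs into chunks up to a size budget. It is not the service's actual algorithm, which this page does not describe. The chunk_paragraphs name, the blank-line paragraph boundary, and the 500-character default are all assumptions chosen for illustration.

```python
def chunk_paragraphs(text: str, max_chars: int = 500) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs intact is a crude stand-in for semantic chunking: each chunk stays on one topic more often than a fixed-width split would.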

Troubleshooting

Upload fails with "File too large"

Files must be under 50 MB. Split large documents or compress PDFs before uploading.

Processing takes a long time

Large documents (especially scanned PDFs requiring OCR) can take several minutes. Check the processing status on the Documents page.
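
In an automated workflow, waiting for processing usually means polling with a timeout. This page does not document a status endpoint, so the sketch below takes a caller-supplied get_status function. That function, the "ready" status string, and the default timings are all hypothetical placeholders.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout: float = 600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns "ready"; return False on timeout.

    The timeout counts the total time slept between polls, not wall-clock time.
    """
    waited = 0.0
    while True:
        if get_status() == "ready":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

The sleep parameter exists so the loop can be exercised in tests without real delays.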

Text not extracted correctly

Ensure scanned PDFs have good quality (300+ DPI). If possible, convert the file to a text-based PDF before uploading.
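
The DPI of a scan can be estimated from its pixel width and the physical page width, since DPI is pixels divided by inches. The sketch below works through that arithmetic. The helper names are hypothetical, and the 300 DPI threshold is the recommendation stated above.

```python
RECOMMENDED_DPI = 300  # threshold recommended on this page


def effective_dpi(pixel_width: int, page_width_inches: float) -> float:
    """Dots per inch of a scanned page: pixel width divided by physical width."""
    return pixel_width / page_width_inches


def meets_recommendation(pixel_width: int, page_width_inches: float) -> bool:
    """True if the scan reaches the recommended resolution."""
    return effective_dpi(pixel_width, page_width_inches) >= RECOMMENDED_DPI
```

For example, a US Letter page is 8.5 inches wide, so a scan 2550 pixels wide works out to 300 DPI, while one 1700 pixels wide is only 200 DPI and may extract poorly.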