Uploading Documents
Learn how to upload and manage your documents
Upload Methods
1. Web Interface (Drag & Drop)
The easiest way to upload documents is through the web interface:
   1. Navigate to the Documents page.
   2. Click "Upload" or drag files directly into the upload area.
   3. If you clicked "Upload", select one or more files from your computer.
   4. Wait for processing to complete (you'll see a progress indicator).
   5. Your documents are now ready for chatting!
2. API Upload
For programmatic uploads, use the REST API:
curl -X POST https://api.doculume.com/api/v1/documents/upload \
-b cookies.txt \
-F "file=@document.pdf"
# Authentication via httpOnly cookie (set during login)
See the API Reference for more details.
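For scripts that cannot shell out to curl, the same request can be sketched with Python's standard library. This is a minimal sketch under stated assumptions, not an official client: the endpoint URL and the `file` form field come from the curl command above, while the helper names `build_multipart` and `upload_document` are hypothetical, and the cookie file is assumed to be in the Netscape format that curl writes. Older Python versions may skip cookies that curl marks with its `#HttpOnly_` prefix.

```python
import http.cookiejar
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Endpoint taken from the curl example above.
UPLOAD_URL = "https://api.doculume.com/api/v1/documents/upload"


def build_multipart(field: str, filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def upload_document(path: str, cookie_file: str = "cookies.txt") -> int:
    """POST the file to the upload endpoint and return the HTTP status code."""
    # Load the session cookie saved at login (assumed Netscape/curl cookie file).
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    jar.load(ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

    file_path = Path(path)
    body, content_type = build_multipart("file", file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        UPLOAD_URL, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with opener.open(request) as response:
        return response.status
```

Calling `upload_document("document.pdf")` mirrors the curl command. Error responses (4xx or 5xx) surface as `urllib.error.HTTPError`, so wrap the call in a `try` block if the script should continue after a failed upload.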
Supported File Formats
PDF
.pdf
PDF documents, including scanned PDFs with OCR support
Word
.docx, .doc
Microsoft Word documents
Text
.txt
Plain text files
Markdown
.md
Markdown formatted documents
HTML
.html, .htm
Web pages and HTML documents
CSV
.csv
Comma-separated values
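When uploading from a script, it can help to filter out unsupported files before sending anything. The sketch below checks extensions against the formats listed above. The `.pdf` extension is inferred from the scanned-PDF entry, the case-insensitive match is an assumption about how the service treats file names, and the function names are hypothetical.

```python
from pathlib import Path

# Extensions from the list above; ".pdf" is assumed from the scanned-PDF entry.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt", ".md", ".html", ".htm", ".csv"}


def is_supported(filename: str) -> bool:
    """Return True when the file extension (case-insensitive) is in the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def partition_by_support(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split filenames into (supported, unsupported), preserving order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported
```

For example, `partition_by_support(["a.csv", "b.exe"])` separates the CSV file from the executable, so only the first list needs to be uploaded.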
Best Practices
File Size
Keep individual files under 50MB; larger files are rejected with a "File too large" error
File Names
Use descriptive names like "Q3-Report-2024.pdf" instead of "document1.pdf"
Batch Uploads
Upload multiple files at once by selecting them together
OCR Quality
For scanned PDFs, ensure text is readable (300+ DPI recommended)
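The size guideline above can be checked locally before a batch upload. This is a rough sketch: it assumes "50MB" means 50 × 1024 × 1024 bytes (the service may count 50,000,000 bytes instead), and `oversized_files` is a hypothetical helper name.

```python
import os

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def oversized_files(paths: list[str], limit: int = MAX_UPLOAD_BYTES) -> list[str]:
    """Return the paths whose size on disk is at or above the limit."""
    return [path for path in paths if os.path.getsize(path) >= limit]
```

Any path returned by `oversized_files` should be split or compressed before it is added to the batch.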
Document Processing
After upload, your documents go through several processing steps:
Text Extraction
Content is extracted from the document (with OCR for scanned PDFs)
Chunking
Text is split into semantic chunks for better retrieval
Vectorization
Chunks are converted to embeddings and stored in the vector database
Ready
Document is ready for intelligent search and chat
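To make the chunking step more concrete, here is a simplified illustration. The service's actual chunking logic is not documented here, so this sketch substitutes a plain paragraph-packing strategy: it groups blank-line-separated paragraphs into chunks of at most `max_chars` characters. The function name and the default size are assumptions chosen for illustration.

```python
def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Greedily pack blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow the chunk, so start a new one.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs intact means each chunk tends to carry a complete thought, which is the general motivation for splitting on meaning rather than at fixed character offsets.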
Troubleshooting
Upload fails with "File too large"
Files must be under 50MB. Split large documents or compress PDFs before uploading.
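For plain-text formats such as .txt, .md, or .csv, splitting can be done on line boundaries. The sketch below is one possible approach, not a built-in feature: `split_text_by_bytes` is a hypothetical helper, and it does not apply to PDF or Word files, which need a dedicated tool. Pass a `max_bytes` value below the upload limit so each part stays under it.

```python
def split_text_by_bytes(text: str, max_bytes: int) -> list[str]:
    """Split text on line boundaries into parts of at most max_bytes (UTF-8 encoded).

    A single line larger than max_bytes is emitted as its own oversized part.
    """
    parts: list[str] = []
    current: list[str] = []
    current_size = 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode("utf-8"))
        if current and current_size + line_size > max_bytes:
            # Flush the current part before it would exceed the byte budget.
            parts.append("".join(current))
            current, current_size = [], 0
        current.append(line)
        current_size += line_size
    if current:
        parts.append("".join(current))
    return parts
```

Joining the returned parts reproduces the original text, so no content is lost. For CSV files, the header row would need to be repeated in each part by hand.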
Processing takes a long time
Large documents (especially scanned PDFs requiring OCR) can take several minutes. Check the processing status on the Documents page.
Text not extracted correctly
Ensure scanned PDFs are of good quality (300+ DPI). If extraction is still poor, try converting the file to a text-based PDF before uploading.