Uploading Documents
The Upload tab is where you load your text files into Archivist's database. During this process, your documents are broken into smaller passages, processed for search, and stored locally so the AI can find relevant information when you ask questions.
What Happens During Upload
When you upload a document, Archivist does three things:
- Splits your document into smaller passages (called "chunks") so the AI can search through them efficiently.
- Processes each passage into a format the AI understands (this is called "embedding" — you don't need to worry about the details).
- Stores everything in your local database, ready to be searched.
This all happens on your computer. Nothing is uploaded to the internet.
Choosing a Chunking Strategy
Archivist offers three strategies for splitting your documents. You'll see these as tabs at the top of the Upload page.
Basic (Recommended for most files)
This is the default and works well for most documents. It splits your text into evenly sized passages based on a token count.
You can adjust two settings:
- Chunk Size (100–4,000 tokens): How large each passage should be. The default of 512 tokens (roughly 300–400 words) works well for most documents. Smaller chunks give more precise search results; larger chunks preserve more context.
- Chunk Overlap (0–500 tokens): How much text is shared between adjacent passages. Because text near a boundary appears in both passages, overlap reduces the chance that an important sentence is cut in half. The default of 100 tokens is a good starting point.
Tip
If you're not sure what settings to use, the defaults work well. You can always re-upload a document with different settings later.
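If you'd like to see how Chunk Size and Chunk Overlap interact, here is a minimal sketch. It is not Archivist's actual code: the function name `chunk_tokens` is hypothetical, and it assumes the text has already been turned into a list of tokens (the example simply uses numbers as stand-in tokens).

```python
def chunk_tokens(tokens, chunk_size=512, overlap=100):
    """Split a token list into fixed-size windows that share `overlap` tokens.

    Hypothetical helper for illustration; Archivist's real splitter may differ.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new passage advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this passage already reaches the end of the document
    return chunks


# Stand-in "document" of 1,000 tokens, split with the default settings.
tokens = list(range(1000))
chunks = chunk_tokens(tokens, chunk_size=512, overlap=100)
for i, chunk in enumerate(chunks):
    print(i, chunk[0], chunk[-1], len(chunk))
```

With the defaults, each passage starts 412 tokens after the previous one, so the last 100 tokens of one passage are repeated at the start of the next.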
Spreadsheet (For CSV files)
Use this when uploading CSV spreadsheet data. Each row in the spreadsheet becomes its own passage — no splitting or overlap needed. Empty rows are automatically skipped.
The Spreadsheet strategy also works with Markdown-formatted tables.
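The row-per-passage idea can be sketched with Python's standard `csv` module. This is an illustration only: the function name `csv_rows_to_chunks` is hypothetical, and treating the header row as an ordinary passage and joining cells with commas are assumptions, not documented Archivist behavior.

```python
import csv
import io


def csv_rows_to_chunks(text):
    """Turn each non-empty CSV row into one passage (hypothetical helper)."""
    chunks = []
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines and rows whose cells are all empty.
        if not any(cell.strip() for cell in row):
            continue
        chunks.append(", ".join(cell.strip() for cell in row))
    return chunks


sample = "name,role\nAda,engineer\n\n,,\nGrace,admiral\n"
rows = csv_rows_to_chunks(sample)
print(rows)
```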
Custom Delimiter (For structured text)
Use this when your document has clear section markers — like ###, ---, or any other separator. Archivist will split the text wherever it finds your chosen delimiter.
You can type in any delimiter, or use one of the quick-select buttons for common options.
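As a rough picture of what delimiter-based splitting does, consider the sketch below. The function name `split_on_delimiter` is hypothetical, and two details are assumptions rather than confirmed Archivist behavior: the delimiter itself is dropped from the passages, and empty passages are discarded.

```python
def split_on_delimiter(text, delimiter):
    """Split text wherever the delimiter appears (hypothetical helper)."""
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    # Trim surrounding whitespace and drop passages that end up empty.
    return [part.strip() for part in text.split(delimiter) if part.strip()]


document = "Intro\n---\nPart one\n---\n\n---\nPart two"
passages = split_on_delimiter(document, "---")
print(passages)
```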
Organizing with File Sets
File sets are like folders within Archivist. They let you group related documents together so you can search within just that group later.
When uploading, you can assign your files to one or more file sets using the file set selector. For example:
- A set called "Q4 Reports" for quarterly business documents
- A set called "Biology 101" for course materials
- A set called "Legal — Smith Case" for case files
Every document is automatically included in an "All Docs" set, so you can always search everything at once if you prefer.
Tip
You can add or change file set assignments later from the Inspect tab without re-uploading your documents.
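One way to picture file sets is as named groups of filenames, where every document also lands in "All Docs". The sketch below is only a mental model: the function `assign_file_sets` and the dictionary layout are hypothetical and say nothing about how Archivist stores file sets internally.

```python
ALL_DOCS = "All Docs"


def assign_file_sets(memberships, filename, set_names=()):
    """Record a document in "All Docs" plus any chosen file sets (hypothetical)."""
    memberships.setdefault(ALL_DOCS, set()).add(filename)
    for name in set_names:
        memberships.setdefault(name, set()).add(filename)
    return memberships


memberships = {}
assign_file_sets(memberships, "q4_summary.txt", ["Q4 Reports"])
assign_file_sets(memberships, "lecture_notes.md", ["Biology 101", "Q4 Reports"])
assign_file_sets(memberships, "misc.txt")  # no file set chosen
print(sorted(memberships[ALL_DOCS]))
```

A document can belong to several sets at once, and a document with no set chosen is still reachable through "All Docs".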
Uploading Files
- Select your chunking strategy and adjust settings if needed.
- Drag and drop your text files onto the upload area, or click to browse. The Upload tab accepts .txt, .md, and .csv files.
- Assign a file set if you want to organize the documents (optional).
- Click Upload All to start processing.
A status bar will appear showing the progress — which file is being processed and how many passages have been embedded so far. You can navigate to other tabs while the upload runs in the background.
When the upload finishes, you'll see a results table showing:
- How many passages were created from each file
- Which strategy was used
- Any errors (for example, if a file with the same name was already in the database)
Handling Duplicates
If you try to upload a file that already exists in the database (matched by filename), Archivist will flag it as a duplicate and skip it. To re-upload a file with different settings, first delete it from the Browse tab, then upload again.
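The filename check can be thought of along these lines. This is a sketch under stated assumptions, not Archivist's implementation: the function name `plan_uploads` is hypothetical, the comparison is assumed to be case-sensitive and based on the filename without its folder, and a second file with the same name in one batch is assumed to be skipped as well.

```python
from pathlib import PurePosixPath


def plan_uploads(paths, existing_names):
    """Separate new files from duplicates by filename (hypothetical helper)."""
    seen = set(existing_names)
    to_upload, skipped = [], []
    for path in paths:
        name = PurePosixPath(path).name  # compare the filename only, not the folder
        if name in seen:
            skipped.append(path)
        else:
            seen.add(name)
            to_upload.append(path)
    return to_upload, skipped


to_upload, skipped = plan_uploads(
    ["docs/a.txt", "b.md", "other/a.txt"],
    existing_names={"b.md"},
)
print(to_upload, skipped)
```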
Next Steps
Once your documents are uploaded, you can:
- Ask questions about them in the Query tab
- Browse your library to see what's in the database
- Inspect individual passages to review or tag them