ContextBox
Personal knowledge assistant with OCR and semantic search.
Stack: Tesseract, pgvector, LLM, FastAPI
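Knowledge Pipeline
[Diagram: automated processing pipeline. A screenshot (sample text "Hello World") is converted to OCR text, then to an embedding, which is stored in pgvector.]
Semantic Search
[Diagram: search by concept over a vector space of 100K+ chunks in <200ms. A query point is matched against nearby documents (Doc A, Doc B, Doc C).]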
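As an illustration of what "search by concept" means in a vector space, here is a minimal sketch that ranks documents by cosine similarity to a query vector. It is pure Python and not ContextBox's actual code: the 3-dimensional vectors and the `rank` helper are hypothetical, and real embeddings have hundreds of dimensions. In a pgvector-backed store the same ordering would typically be done inside the database, for example with its cosine distance operator `<=>`.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, treat as unrelated
    return dot / (norm_a * norm_b)


def rank(query: List[float], docs: Dict[str, List[float]], top_k: int = 3) -> List[Tuple[str, float]]:
    """Score every document against the query and return the top_k, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy vectors standing in for real embeddings.
docs = {
    "Doc A": [0.9, 0.1, 0.0],
    "Doc B": [0.0, 1.0, 0.0],
    "Doc C": [0.7, 0.7, 0.0],
}
query = [1.0, 0.0, 0.0]
results = rank(query, docs)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A linear scan like this is only suitable for small collections; at the 100K+ chunk scale quoted on this page, an indexed vector store is the more plausible route to sub-200ms lookups.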
Processing Flow
Capture: Screenshot
Extract: Tesseract OCR
Embed: Transformers
Retrieve: pgvector
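The four stages above could be wired together roughly as follows. This is a sketch under stated assumptions, not the project's implementation: `Chunk`, `split_into_chunks`, and `process_screenshot` are hypothetical names, the word-based chunk sizes are arbitrary, and the OCR and embedding stages are passed in as plain callables so the example runs without Tesseract or a model installed. In a real setup the `ocr` callable might wrap `pytesseract.image_to_string`, `embed` might call a transformer embedding model, and `store` would be a pgvector table rather than a list.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    source: str          # path of the screenshot the text came from
    text: str            # chunk of OCR text
    vector: List[float]  # embedding of the chunk


def split_into_chunks(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word windows of max_words that overlap by `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def process_screenshot(
    path: str,
    ocr: Callable[[str], str],
    embed: Callable[[str], List[float]],
    store: List[Chunk],
    max_words: int = 50,
    overlap: int = 10,
) -> int:
    """Capture -> Extract -> Embed -> store; returns the number of chunks added."""
    text = ocr(path)                                       # Extract
    pieces = split_into_chunks(text, max_words, overlap)
    for piece in pieces:
        store.append(Chunk(path, piece, embed(piece)))     # Embed and store
    return len(pieces)


# Stand-in stages so the sketch runs anywhere (assumptions, not real OCR or embeddings).
def fake_ocr(path: str) -> str:
    return "Hello World from a captured screenshot"


def fake_embed(text: str) -> List[float]:
    return [float(len(text)), float(len(text.split()))]


store: List[Chunk] = []
count = process_screenshot("shot.png", fake_ocr, fake_embed, store, max_words=4, overlap=1)
print(count, [c.text for c in store])
```

Passing the stages in as callables keeps the flow testable in isolation and makes it straightforward to swap the OCR engine or embedding model later.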
Screenshots: 1K+
Chunks Embedded: 100K+
Search Latency: <200ms
OCR Accuracy: 90%+