A professional tool for creating and managing comprehensive dictionaries using the LIFT (Lexicon Interchange FormaT) standard. This Flask-based application provides full support for LIFT 0.13+ with extensive features designed for lexicographers, linguists, and language documentation specialists.
- Multilingual Support: Every text field supports multiple writing systems simultaneously
- Senses & Subsenses: Organize word meanings hierarchically to capture polysemy and semantic relationships
- Examples & Usage: Rich contextual information with source language examples and translations
- Pronunciation Management: Comprehensive phonetic documentation with IPA transcription, audio files, and TTS integration
- Etymology Tracking: Document word origins and historical development
- Variants & Allomorphs: Document different forms of the same lexeme, following the SIL FieldWorks approach
- Lexical Relations: Create semantic networks connecting related entries (synonyms, antonyms, hypernyms, etc.)
- Reversals: Essential for bilingual dictionaries: create L2→L1 lookup capability
- Annotations & Messages: Editorial workflow and quality control with per-entry discussion threads
- Edit History & Change Tracking: Every entry save automatically records a revision with a full JSON snapshot of the entry state. Field-level diffs are computed against the previous revision, showing exactly what changed (added, removed, or modified fields). Per-entry revision timelines are displayed directly on the edit page. The Change Analytics dashboard at /workbench/analytics aggregates edit activity across the dictionary with date-range filtering, by-field breakdowns, top editors, and a revision timeline chart, giving editors visibility into what changed, when, and by whom.
- Entry Form: Rich multilingual editing with POS inheritance, variant relations, component relations, and subentries
- Worksets: Query-based dynamic collections of entries with curation metadata (status, favorites, notes)
- Bulk Operations: Batch update traits, POS tags, and other fields across multiple entries with preview mode
- Merge & Split: Merge multiple entries into one or split an entry into multiple, with full undo/redo history
- Auto-Save: Automatic form state persistence with undo/redo for entry edits
- Keyboard Shortcuts: Full keyboard navigation and editing workflow
- LIFT Import/Export: Full bidirectional LIFT 0.13+ support with merge/replace modes
- SFM/Shoebox Import: Two-step import with marker auto-detection and interactive mapping UI
- FieldWorks list.xml Import: Abbreviation import from FieldWorks
- HTML Export: Generate browsable static HTML dictionaries with alphabetical navigation
- Markdown Export: Export entries in Markdown format
- Validation Engine: Multiple validation backends: Schematron (XSLT), Hunspell spelling, LanguageTool grammar, IPA pronunciation, real-time field validation
- Validation Rules: Project-specific validation rules with admin UI
- AI Proofreading: LLM-powered proofreading and drafting of entries (BYOK: bring your own API key)
- Data Quality Dashboard: Overview of dictionary health and completeness
- Custom Fields: Extend LIFT to meet specific project needs with FieldWorks-compatible custom fields
- Ranges Editor: Full CRUD for controlled vocabularies (grammatical categories, semantic domains, lexical relations)
- Display Profiles: CSS-based entry rendering system with multiple profiles and custom styling
- Project Settings: Per-project configuration for AI, SMTP, external services, and field visibility defaults
- Project Setup Wizard: Bootstrap new projects with recommended ranges and configurations
- RESTful API: Comprehensive JSON API for all dictionary operations
- Advanced Search: XQuery-powered full-text search across all fields with filters and facets
- Corpus Management: Lucene-based parallel corpus search (concordance) with management UI
- Word Sketch: External ConceptSketch integration for collocation and grammar pattern analysis
- Backup & Restore: Manual and scheduled backups of the BaseX XML database with undo/redo history
- User Management: Role-based access control (ADMIN, MEMBER, VIEWER) with API key authentication
- Swagger API Docs: Interactive API documentation via Flasgger at /apidocs/
- Docker Support: Full docker-compose setup for all services including the Flask app
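
The field-level diff described under Edit History & Change Tracking can be pictured with a small sketch. This is not the application's own code: the function names (`flatten`, `diff_snapshots`), the dotted-path key format, and the sample entry fields are all assumptions made for illustration. It compares two JSON-like snapshots and reports added, removed, and modified fields.

```python
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts/lists into {"dotted.path": leaf_value} pairs."""
    items: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            items.update(flatten(child, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            items.update(flatten(child, f"{prefix}[{index}]"))
    else:
        items[prefix] = value
    return items


def diff_snapshots(previous: dict, current: dict) -> Dict[str, dict]:
    """Compare two snapshots field by field (hypothetical helper)."""
    old, new = flatten(previous), flatten(current)
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": {k: old[k] for k in old.keys() - new.keys()},
        "modified": {
            k: {"from": old[k], "to": new[k]}
            for k in old.keys() & new.keys()
            if old[k] != new[k]
        },
    }


# Hypothetical snapshots of one entry before and after an edit.
previous = {"lexical_unit": {"en": "run"}, "senses": [{"gloss": {"en": "move fast"}}]}
current = {
    "lexical_unit": {"en": "run"},
    "senses": [{"gloss": {"en": "move quickly"}, "pos": "verb"}],
}
changes = diff_snapshots(previous, current)
```

Flattening first keeps the comparison to plain set operations on keys. A positional list path such as `senses[0]` means that reordering senses would show up as modifications, which a production diff would likely handle differently.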
- Python 3.8+
- BaseX XML Database (version 9.0+, port 1984)
- PostgreSQL (version 15+, port 5432)
- Redis (for caching, port 6379)
- Java Runtime Environment (for BaseX and Saxon)
- Docker & Docker Compose (recommended for easy setup)
- ConceptSketch (port 8080): word sketch / collocation analysis service. Clone and run separately to enable the Word Sketch feature.
- corpus-lucene-service (port 8082): parallel corpus concordance search. Clone and run separately to enable Corpus Management.
- LanguageTool (port 8081): grammar and style checking (for validation engine)
- Saxon XSLT Processor (included at tools/saxon/, auto-installed via install_saxon.sh): Schematron XSLT2 validation
Use Docker Compose to start all required services:
docker compose up -d

This starts: Flask app (port 5000), BaseX (ports 1984 TCP, 8984 HTTP), PostgreSQL (port 5432), Redis (port 6379), and a test PostgreSQL instance (port 5433).
To start services individually with the provided scripts:
# Start BaseX
./start-basex.sh
# Or start all services (BaseX + Redis)
./start-services.sh

Ensure PostgreSQL is running separately (e.g. systemctl start postgresql or via your OS package manager).
git clone <repository-url>
cd flask-app
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt

If you intend to run the Playwright end-to-end tests, install Node dependencies and download the Playwright browsers. Note that the node_modules/ directory is ignored by Git (run npm ci after cloning to populate it).
# Install Node dependencies (deterministic install)
npm ci
# Install Playwright browsers (Chromium, Firefox, WebKit)
# Use --with-deps on Linux to ensure system packages are installed
npx playwright install --with-deps chromium firefox webkit

Copy the example environment file and update the settings:
cp .env.example .env

Edit the .env file with your configuration; see .env.example for all available variables. Key settings:
# BaseX
BASEX_HOST=localhost
BASEX_PORT=1984
BASEX_USERNAME=admin
BASEX_PASSWORD=admin
BASEX_DATABASE=dictionary
# PostgreSQL (worksets, users, settings)
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=dictionary_analytics
POSTGRES_USER=dict_user
POSTGRES_PASSWORD=dict_pass
# Redis (caching)
REDIS_HOST=localhost
REDIS_PORT=6379
# Flask
SECRET_KEY=your-secret-key-here

Option A: Docker Compose (recommended)
docker compose up -d

Option B: start manually
# Start BaseX
./start-basex.sh
# Ensure PostgreSQL is running and create the database
createdb dictionary_analytics
# Redis (if installed locally)
redis-server

python run.py

The application will be available at http://localhost:5000
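
If the application does not come up, one quick check is whether the services named in .env are reachable. The sketch below is a standalone helper, not part of the repository. It reads the host and port variables shown above, falls back to the documented defaults, and attempts a plain TCP connection to each. The function names are hypothetical, and it assumes the .env values have been exported into the process environment.

```python
import os
import socket


def service_targets(env=os.environ):
    """Return {service: (host, port)} from the .env variable names, using documented defaults."""
    return {
        "BaseX": (env.get("BASEX_HOST", "localhost"), int(env.get("BASEX_PORT", "1984"))),
        "PostgreSQL": (env.get("POSTGRES_HOST", "localhost"), int(env.get("POSTGRES_PORT", "5432"))),
        "Redis": (env.get("REDIS_HOST", "localhost"), int(env.get("REDIS_PORT", "6379"))),
    }


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(env=os.environ):
    """Return {service: bool} for every configured backend (hypothetical helper)."""
    return {
        name: is_reachable(host, port)
        for name, (host, port) in service_targets(env).items()
    }
```

Calling `check_services()` returns a dictionary mapping each service name to `True` or `False`. A successful TCP connection only shows that something is listening on the port; it does not verify credentials or that the expected database exists.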
- Go to Import/Export → Import LIFT
- Upload your LIFT file to begin working with your dictionary data
- The application supports LIFT 0.13 format with 91% element coverage
- Click on Entries to view your lexicon
- Click on any entry to open the full editor
- Use the comprehensive editing interface to modify or add new entries
Access the Tools → Ranges Editor to manage controlled vocabularies:
- View and edit grammatical information categories
- Manage semantic domains and lexical relations
- Create custom classification systems
Export your dictionary in multiple formats through Import/Export → Export:
- LIFT Export: Full LIFT 0.13+ XML export (single file or dual file + ranges ZIP)
- HTML Export: Generate browsable static HTML pages with CSS-driven entry rendering
- Markdown Export: Export entries in Markdown format for documentation or publishing
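
The alphabetical navigation mentioned for the HTML export amounts to filing each headword under an index letter. The following is only a sketch of that idea under simple assumptions, not the exporter's actual logic: the function names are hypothetical, accents are stripped via Unicode decomposition, leading punctuation is skipped, and headwords without any letter go under "#".

```python
import unicodedata
from collections import defaultdict


def index_letter(headword):
    """Return the uppercase base letter a headword is filed under, or "#" if it has none."""
    for char in unicodedata.normalize("NFD", headword):
        if char.isalpha():
            return char.upper()
    return "#"


def group_by_letter(headwords):
    """Group headwords by index letter, with letters and words in sorted order."""
    groups = defaultdict(list)
    for headword in headwords:
        groups[index_letter(headword)].append(headword)
    return {
        letter: sorted(words, key=str.casefold)
        for letter, words in sorted(groups.items())
    }


# Hypothetical headwords, including an affix and a numeric form.
sections = group_by_letter(["banana", "Apple", "élan", "apricot", "-ed", "42"])
```

Sorting here is by code point after case folding, which is not a locale-aware collation. A real dictionary would normally sort according to the rules of each writing system.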
The application provides comprehensive LIFT 0.13+ support across all major element categories:
- Entry Elements: All essential entry components (lexeme, citation, variants, alternate forms, notes, fields)
- Sense Elements: Complete sense management with glosses, definitions, semantic domains, and subsenses
- Example Elements: Full example support with translations and source language
- Pronunciation: Complete phonetic documentation with IPA, audio files, and TTS integration
- Etymology: Full etymology tracking with source, form, and gloss
- Custom Fields: Extensive custom field support compatible with FieldWorks/FLEx
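
To make the element categories above concrete, here is a small hand-written LIFT fragment read with Python's standard library. It is an illustration of the LIFT structure, not the application's parser: the sample entry and the `summarize_entry` helper are made up for this sketch. Each `form` carries a `lang` attribute, which is how one field can hold text in several writing systems at once.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; real LIFT files carry more attributes (guid, dates, etc.).
SAMPLE_LIFT = """<lift version="0.13">
  <entry id="run_1">
    <lexical-unit>
      <form lang="en"><text>run</text></form>
    </lexical-unit>
    <pronunciation>
      <form lang="en-fonipa"><text>rʌn</text></form>
    </pronunciation>
    <sense id="run_1_s1">
      <grammatical-info value="Verb"/>
      <gloss lang="fr"><text>courir</text></gloss>
      <definition>
        <form lang="en"><text>to move quickly on foot</text></form>
      </definition>
      <example>
        <form lang="en"><text>She runs every morning.</text></form>
        <translation>
          <form lang="fr"><text>Elle court tous les matins.</text></form>
        </translation>
      </example>
    </sense>
  </entry>
</lift>
"""


def forms(parent):
    """Map writing-system code -> text for each <form> directly under parent."""
    if parent is None:
        return {}
    return {f.get("lang"): f.findtext("text", default="") for f in parent.findall("form")}


def summarize_entry(entry):
    """Collect the main parts of one <entry> into a plain dict (hypothetical helper)."""
    senses = []
    for sense in entry.findall("sense"):
        gram = sense.find("grammatical-info")
        senses.append({
            "id": sense.get("id"),
            "pos": gram.get("value") if gram is not None else None,
            "glosses": {g.get("lang"): g.findtext("text", default="") for g in sense.findall("gloss")},
            "definition": forms(sense.find("definition")),
            "examples": [forms(example) for example in sense.findall("example")],
        })
    return {
        "id": entry.get("id"),
        "lexical_unit": forms(entry.find("lexical-unit")),
        "pronunciations": [forms(p) for p in entry.findall("pronunciation")],
        "senses": senses,
    }


root = ET.fromstring(SAMPLE_LIFT)
entries = [summarize_entry(entry) for entry in root.findall("entry")]
```

The sketch ignores subsenses, relations, etymology, and custom fields, and it does not read example translations.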
| Format | Status | Details |
|---|---|---|
| LIFT (.lift) | Full | Merge or replace modes, with optional .lift-ranges and list.xml |
| SFM/Shoebox | Full | Marker auto-detection with interactive mapping interface |
| FieldWorks list.xml | Full | Abbreviation import |
# Import a LIFT file with optional ranges
python -m scripts.import_lift path/to/lift_file.lift [path/to/lift_ranges.lift-ranges]

| Format | Status | Details |
|---|---|---|
| LIFT (.lift) | Full | Single file or dual file + ranges ZIP |
| HTML | Full | Browsable static HTML with CSS rendering and alphabetical navigation |
| Markdown | Full | Markdown format for documentation |
| Kindle (MOBI/AZW3) | Available | Script at tools/scripts/kindle_generator.py; generates Kindle-compatible dictionaries via the REST API |
# Export to a LIFT file
python -m scripts.export_lift path/to/output.lift
# Generate a Kindle dictionary (requires Calibre or KindleGen)
python tools/scripts/kindle_generator.py --format mobi --output my_dict.mobi

The scripts/ and tools/scripts/ directories contain utility scripts for extending and maintaining the application:

| Script | Purpose |
|---|---|
| tools/scripts/kindle_generator.py | Generate Kindle MOBI/AZW3 dictionaries from the API |
| tools/scripts/api_client.py | Programmatic REST API client for batch operations |
| tools/scripts/ai_quality_control.py | AI-powered quality checks on dictionary data |
| scripts/import_lift.py | CLI LIFT import (alternative to web UI) |
| scripts/export_lift.py | CLI LIFT export (alternative to web UI) |
| scripts/validate_xml_compatibility.py | Validate LIFT XML compatibility |

Full interactive API documentation is available at /apidocs/ (Swagger/OpenAPI via Flasgger). Key endpoint groups:
- GET /: List entries with pagination and search
- GET /{id}: Get a specific entry
- POST /: Create a new entry
- PUT /{id}: Update an existing entry
- DELETE /{id}: Delete an entry

- GET /: Full-text search across all fields
- GET /ranges: Get range definitions and controlled vocabularies
- GET /ranges/{id}: Get values for a specific range

- GET /lift: Export dictionary to LIFT XML
- GET /html: Export dictionary to HTML
- GET /download/{file}: Download a generated export file

- POST /proofread: AI proofreading of an entry
- POST /draft: AI drafting of a new entry from description
- POST /batch-proofread: Batch proofread multiple entries

- GET /entry/{id}: Validate a specific entry
- GET /dictionary: Validate the entire dictionary
- POST /check: Run validation checks

- GET /: List worksets
- POST /: Create a workset
- POST /{id}/entries: Manage workset entries

- GET /api/stats: Dictionary statistics and entry counts
- POST /api/backup/create: Create database backup
- POST /api/merge-split/merge: Merge entries
- POST /api/merge-split/split: Split an entry
- GET/POST /api/corpus/search: Lucene-based parallel corpus search
- GET /api/profiles: Display profile management
- GET /api/query-builder/fields: Available search fields
- GET/POST /api/bulk/query: Query and bulk-operate on entries
- POST /api/auth/login: User authentication
- GET /api/projects/{id}/validation-rules: Validation rules
- GET /api/lift/elements: LIFT element registry
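
As a rough illustration of calling the endpoints listed above from a script, the sketch below uses only the Python standard library. It is not the bundled tools/scripts/api_client.py. The helper names are hypothetical, the base URL is the default address from the setup steps, and the code assumes the endpoint returns JSON. No credentials are sent, so endpoints that require authentication would need more than this.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:5000"  # default address from the setup section


def build_url(path, base_url=BASE_URL):
    """Join an endpoint path such as "/api/stats" onto the base URL."""
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


def get_json(path, base_url=BASE_URL, timeout=10):
    """Send a GET request and decode the body as JSON (assumes a JSON response)."""
    request = Request(build_url(path, base_url), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running, `get_json("/api/stats")` would request the dictionary statistics endpoint. The shape of the returned data is documented at /apidocs/ rather than assumed here.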
# Run all tests
pytest
# Run with coverage
pytest --cov=app tests/
# Run JavaScript tests
npm test
# Run end-to-end tests (requires Playwright browsers)
npm run test:e2e
# Format Python code with black
black .
# Lint Python code
flake8
# Type checking
mypy .
# Format JavaScript code
npm run format:js
# Lint JavaScript code
npm run lint:js

This application is in active development. All core features are operational: full LIFT import/export, entry editing, search, validation, AI assistance, user management, backup/restore, and multiple export formats.
This project is licensed under the MIT License - see the LICENSE file for details.
For technical questions about LIFT format:
For lexicographic resources and guidance:
- Introduction to Lexicography by Ron Moe
For application-specific support, contact your system administrator or open an issue in the repository.
