GraphDB is an experimental graph database engine and command-line interface (CLI) optimized for medical and healthcare applications. It empowers developers, researchers, and healthcare professionals to build, query, and analyze interconnected medical data with high context-awareness. By leveraging a graph-native approach, GraphDB unlocks insights from complex relationships that traditional relational databases struggle to handle, making it an ideal complement to existing Electronic Health Record (EHR) systems.
Note: GraphDB is under active development. APIs and behavior may change before the 1.0 release. In production, ensure encryption, authentication, and access controls are configured to meet HIPAA/GDPR compliance requirements.
- Why Medical Practices Need GraphDB
- Key Benefits
- What It Does
- Quick Example
- Architecture
- How It Works
- Complementing Existing EHRs
- Example Use Cases
- Getting Started
- File Structure
- Crate/Module Details
- Ports, Daemons, and Clusters
- Command-Line Interface (CLI) Usage
- REST API Usage
- Storage Backends
- Future Vision: Advanced Querying & AI Integration
- Medical Ontology Support
- Contributing
- License
- Links
Electronic Health Record (EHR) systems typically rely on linear, table-based relational models. However, medical data is inherently interconnected, forming complex relationships that are challenging to represent or query efficiently in traditional systems. For example:
- Patients have encounters with providers.
- Encounters generate diagnoses, procedures, notes, and billing codes.
- Medications and prescriptions involve drug interactions and side effects.
- Data flows from devices, labs, insurers, pharmacies, and public health databases.
Complex queries, such as:
- "Which patients are at risk based on recent prescriptions and lab results?"
- "Which providers might be undercoding based on their encounter history?"
- "Show a patient's medical, behavioral, and socioeconomic history over the past 3 years."
are inefficient or infeasible in relational models. GraphDB addresses this gap by providing a graph-native database that excels at modeling and querying these relationships, enabling faster, more intuitive insights for healthcare applications.
GraphDB offers unique advantages for healthcare data management:
- Intuitive Data Modeling: Represents complex medical relationships (e.g., patient-provider interactions, drug interactions) as nodes and edges, making data exploration natural and efficient.
- Powerful Querying: Supports natural language, Cypher, SQL, and GraphQL queries, enabling both technical and non-technical users to extract insights.
- Seamless Integration: Complements existing EHR systems by ingesting data in formats like FHIR, HL7, or CSV, acting as a smart middleware layer.
- Scalable Architecture: Supports standalone, daemonized, or clustered deployments for flexibility and performance.
- Healthcare-Specific Features: Includes built-in support for medical ontologies (e.g., ICD-10, SNOMED) and planned AI-driven analytics for advanced insights.
- Open-Source and Extensible: MIT-licensed with a pluggable architecture, encouraging community contributions and custom extensions.
GraphDB is designed to handle the complexity of medical data through:
- Graph-Native Data Model: Uses vertices (nodes) and edges (relationships) to capture nuanced connections in medical data, such as patient diagnoses or provider interactions.
- Natural Language Querying: Translates high-level or natural language requests into executable queries (e.g., Cypher, SQL, GraphQL) for ease of use.
- Flexible Deployment: Offers a powerful CLI for interactive and scripted use, alongside a daemonized REST API for integration with existing systems.
- Pluggable Extensions: Supports healthcare-specific plugins for standards like FHIR, HL7, ICD-10, CPT, and X12.
- Middleware Capabilities: Acts as a context-aware layer for legacy or modern EHR systems, enhancing their relational capabilities.
- Advanced Analytics: Enables graph analytics, risk modeling, explainable AI, and auditable traceability for compliance and insights.
Here's a simple Cypher query to find patients diagnosed with Type 2 Diabetes (ICD-10 code E11):
MATCH (p:Patient)-[:HAS_DIAGNOSIS]->(d:Diagnosis)
WHERE d.code = "E11"
RETURN p.name, p.age
This query traverses the graph to return patient names and ages, demonstrating GraphDB's ability to handle relational queries efficiently.
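To make the pattern match concrete, here is a minimal Python sketch of what the query above asks for, run against a tiny in-memory graph. The data and the helper name `patients_with_code` are invented for illustration; this is not how GraphDB executes Cypher internally.

```python
# Toy property graph: node id -> (label, properties). All data is invented.
nodes = {
    "p1": ("Patient", {"name": "Alice", "age": 54}),
    "p2": ("Patient", {"name": "Bob", "age": 61}),
    "d1": ("Diagnosis", {"code": "E11"}),
    "d2": ("Diagnosis", {"code": "I10"}),
}
# Directed edges as (source id, relationship type, target id).
edges = [
    ("p1", "HAS_DIAGNOSIS", "d1"),
    ("p2", "HAS_DIAGNOSIS", "d2"),
]


def patients_with_code(code):
    """Mirror MATCH (p:Patient)-[:HAS_DIAGNOSIS]->(d:Diagnosis) WHERE d.code = code."""
    rows = []
    for src, rel, dst in edges:
        src_label, src_props = nodes[src]
        dst_label, dst_props = nodes[dst]
        if (rel == "HAS_DIAGNOSIS" and src_label == "Patient"
                and dst_label == "Diagnosis" and dst_props["code"] == code):
            # RETURN p.name, p.age
            rows.append((src_props["name"], src_props["age"]))
    return rows


print(patients_with_code("E11"))  # [('Alice', 54)]
```

The loop scans every edge, which is fine for a toy example; a real engine would typically use an index on the diagnosis code instead.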
GraphDB's modular, daemonized architecture ensures scalability, performance, and flexibility. Below is a visual representation of its components and their interactions:
+-------------------------------------------------------------------------+
|              graphdb-cli (Interactive & Scriptable Client)              |
| +---------------------------------------------------------------------+ |
| | Parses CLI commands, transforms queries, dispatches to daemons      | |
| +---------------------------------------------------------------------+ |
+------------------------------------|------------------------------------+
                                     | (Local Process / HTTP / gRPC)
                                     v
+-------------------------------------------------------------------------+
|                   graphdb-rest_api (REST API Gateway)                   |
| +---------------------------------------------------------------------+ |
| | Exposes RESTful endpoints for programmatic access                   | |
| | Handles authentication, routing, and data serialization             | |
| +---------------------------------------------------------------------+ |
+------------------------------------|------------------------------------+
                                     | (gRPC / Internal IPC)
                                     v
+-------------------------------------------------------------------------+
|              graphdb-daemon (Core Graph Processing Daemon)              |
| +---------------------------------------------------------------------+ |
| | Manages graph state, executes queries, handles concurrency          | |
| | Uses graphdb-lib for graph modeling and query execution             | |
| | Supports single-instance or clustered deployments                   | |
| +---------------------------------------------------------------------+ |
+------------------------------------|------------------------------------+
                                     | (Internal IPC / Storage Protocol)
                                     v
+-------------------------------------------------------------------------+
|           graphdb-storage-daemon (Pluggable Storage Backend)            |
| +---------------------------------------------------------------------+ |
| | Manages persistent storage, indexing, and transactional integrity   | |
| | Supports multiple backends (Postgres, Redis, RocksDB, Sled)         | |
| +---------------------------------------------------------------------+ |
+-------------------------------------------------------------------------+
This architecture allows independent scaling of components, supporting both lightweight local deployments and distributed, high-performance clusters.
GraphDB processes data through a streamlined workflow:
- Input Parsing: Queries from the CLI or REST API (natural language, Cypher, SQL, or GraphQL) are parsed and transformed into an internal graph traversal representation.
- Daemonized Execution: The graphdb-daemon handles query execution, maintains graph state, and supports concurrent access. It leverages in-memory caching for performance.
- Storage Management: The graphdb-storage-daemon abstracts persistent storage, supporting multiple backends (e.g., Postgres, RocksDB) via pluggable interfaces.
- Integration: GraphDB can operate standalone for graph analysis or integrate into existing healthcare IT pipelines, enhancing data interoperability.
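The layering described in this workflow can be sketched in a few lines of Python. This is a rough illustration only: the class names `InMemoryStorage` and `Daemon`, the `scan` method, and the idea of caching by relationship type are all assumptions made for the example, not GraphDB's actual internals.

```python
class InMemoryStorage:
    """Stand-in for the storage daemon: holds edges as (source, type, target)."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.reads = 0  # counts storage scans so cache hits are observable

    def scan(self, rel_type):
        self.reads += 1
        return [(s, t) for s, r, t in self.edges if r == rel_type]


class Daemon:
    """Stand-in for the graph daemon: executes requests and caches results."""

    def __init__(self, storage):
        self.storage = storage
        self.cache = {}

    def execute(self, request):
        # Step 1: "parse" the input into an internal representation.
        # Here that is just a normalized relationship type.
        rel_type = request.strip().upper()
        # Step 2: serve repeated requests from the in-memory cache.
        if rel_type not in self.cache:
            # Step 3: fall through to the storage layer on a cache miss.
            self.cache[rel_type] = self.storage.scan(rel_type)
        return self.cache[rel_type]


storage = InMemoryStorage([("p1", "HAS_DIAGNOSIS", "d1"), ("p1", "SEEN_BY", "dr1")])
daemon = Daemon(storage)
print(daemon.execute("has_diagnosis"))  # [('p1', 'd1')]
print(daemon.execute("HAS_DIAGNOSIS"))  # same result, served from the cache
print(storage.reads)                    # 1
```

The sketch omits cache invalidation on writes, which any real caching layer would need.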
GraphDB enhances, rather than replaces, existing EHR systems by:
- Ingesting Data: Supports formats like CSV, HL7, FHIR, or direct Postgres connections, making it easy to import data from EHRs.
- Transforming Data: Converts structured and semi-structured data into a queryable graph model, preserving relationships.
- Enabling Insights: Facilitates temporal and semantic joins across disparate datasets, uncovering insights hidden in relational structures (e.g., linking patient records with lab results and billing codes).
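As an illustration of the transformation step, the following Python sketch turns a flat CSV export into nodes and edges. The column names, the `dx:` id prefix, and the function `csv_to_graph` are assumptions for the example; GraphDB's own ingestion format is not specified here.

```python
import csv
import io

# Invented sample export; the column names are assumptions, not an EHR standard.
CSV_TEXT = """patient_id,patient_name,diagnosis_code
p1,Alice,E11
p1,Alice,I10
p2,Bob,E11
"""


def csv_to_graph(text):
    """Turn flat rows into unique nodes plus HAS_DIAGNOSIS edges."""
    nodes = {}  # node id -> (label, properties)
    edges = []  # (source id, relationship type, target id)
    for row in csv.DictReader(io.StringIO(text)):
        patient_id = row["patient_id"]
        diagnosis_id = "dx:" + row["diagnosis_code"]
        # Repeated rows for the same patient or code collapse into one node.
        nodes[patient_id] = ("Patient", {"name": row["patient_name"]})
        nodes[diagnosis_id] = ("Diagnosis", {"code": row["diagnosis_code"]})
        edges.append((patient_id, "HAS_DIAGNOSIS", diagnosis_id))
    return nodes, edges


nodes, edges = csv_to_graph(CSV_TEXT)
print(len(nodes), len(edges))  # 4 3
```

Three rows become four nodes (two patients, two diagnoses) and three edges, so the shared E11 diagnosis is stored once and linked to both patients.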
GraphDB's graph-native approach unlocks powerful healthcare applications:
- Clinical Decision Support: Identify drug-allergy interactions or suggest treatment paths by traversing patient history graphs in real-time.
- Billing Optimization: Detect missed CPT coding opportunities or fraudulent billing patterns using graph-based anomaly detection.
- Patient Risk Modeling: Build longitudinal graphs of patient medical, behavioral, and socioeconomic factors for predictive analytics and proactive care.
- Security and Compliance: Visualize user access logs as graphs to ensure HIPAA/GDPR compliance and detect unauthorized access.
- Research and Epidemiology: Analyze disease propagation networks, identify clinical trial cohorts, or study social determinants of health.
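The clinical decision support case is essentially a two-hop traversal: from a patient to prescribed drugs to each drug's class, compared against the patient's recorded allergies. A minimal Python sketch, using invented records and hypothetical names (`DRUG_CLASS`, `allergy_conflicts`), might look like this:

```python
# Invented toy data; a real deployment would read these relationships from the graph.
DRUG_CLASS = {"amoxicillin": "penicillin", "ibuprofen": "nsaid"}
ALLERGIES = {"p1": {"penicillin"}, "p2": set()}
PRESCRIPTIONS = {"p1": ["amoxicillin", "ibuprofen"], "p2": ["amoxicillin"]}


def allergy_conflicts(patient_id):
    """Follow patient -> prescription -> drug class and compare with patient -> allergy."""
    conflicts = []
    for drug in PRESCRIPTIONS.get(patient_id, []):
        drug_class = DRUG_CLASS.get(drug)
        if drug_class in ALLERGIES.get(patient_id, set()):
            conflicts.append((drug, drug_class))
    return conflicts


print(allergy_conflicts("p1"))  # [('amoxicillin', 'penicillin')]
print(allergy_conflicts("p2"))  # []
```

This is a teaching example only and not clinical logic; real interaction checking depends on curated drug knowledge bases.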
Before building GraphDB, ensure the following are installed:
- Rust: Version 1.72 or higher (rustup install 1.72).
- Cargo: Included with Rust for building and managing dependencies.
- Git: For cloning the repository.
- Optional Backends (if used):
  - Postgres: For relational storage.
  - Redis: For caching.
  - RocksDB/Sled: For embedded key-value storage.

- Clone the repository:
  git clone https://github.com/dmitryro/graphdb.git
  cd graphdb
- Build the CLI executable:
  cargo build --workspace --release --bin graphdb-cli
  The compiled binary will be located at ./target/release/graphdb-cli.
GraphDB supports multiple interaction modes:
- Interactive CLI: For exploratory querying and management.
- Scripted CLI: For automation and batch processing.
- REST API: For programmatic integration with other applications.
- Start Interactive CLI:
  ./target/release/graphdb-cli --cli
  Enter commands such as start, stop, status, or rest graph-query.
- Start a Single Graph Daemon:
  ./target/release/graphdb-cli start --port 9001
  Default port is 8080 if --port is omitted.
- Start a Daemon Cluster:
  ./target/release/graphdb-cli start --cluster 9001-9003
  Launches daemons on ports 9001-9003.
- Start REST API and Storage Daemon:
  ./target/release/graphdb-cli start --listen-port 8082 --storage-port 8085
  REST API runs on port 8082, storage daemon on 8085.
- Stop Components:
  - Stop all components:
    ./target/release/graphdb-cli stop
  - Stop specific components:
    ./target/release/graphdb-cli stop rest
    ./target/release/graphdb-cli stop daemon --port 9001
  - Stop the Storage Daemon by port:
    ./target/release/graphdb-cli stop storage --port 8085

- Direct CLI Query:
  ./target/release/graphdb-cli --query "MATCH (n) RETURN n"
- Interactive CLI Query:
  graphdb-cli> rest graph-query "MATCH (p:Patient) RETURN p.name LIMIT 5"
- REST API Query:
  curl -X POST http://127.0.0.1:8082/api/v1/query \
    -H "Content-Type: application/json" \
    -d '{"query":"MATCH (n:Person {name: \"Alice\"}) RETURN n"}'
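For scripted use, the REST query can also be issued from Python with only the standard library. The sketch below builds the same POST request as the curl command; the helper name `build_query_request` is hypothetical, and the shape of the response body is not documented here, so the send step is left commented out.

```python
import json
import urllib.request


def build_query_request(query, base_url="http://127.0.0.1:8082"):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/v1/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request('MATCH (n:Person {name: "Alice"}) RETURN n')
print(request.get_method(), request.full_url)

# To send it against a running REST API daemon:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.read().decode("utf-8"))
```

Using `json.dumps` avoids the manual quote escaping that the shell version needs.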
The project is organized for modularity and maintainability:
- graphdb-lib/: Core graph engine, data structures, and query parsing.
- server/: CLI application (graphdb-cli) and its components.
- daemon-api/: Interfaces for daemon communication (e.g., gRPC).
- rest-api/: RESTful API gateway for external access.
- storage-daemon-server/: Pluggable storage backend daemon.
- proto/: gRPC service definitions for distributed setups.
- models/medical/: Healthcare-specific graph structures and ontologies.
- graphdb-lib/
  - Purpose: Core graph engine with data structures (nodes, edges), traversal algorithms (BFS, DFS, shortest path), and query parsing for Cypher, SQL, and GraphQL.
  - Features:
    - Efficient in-memory graph representation.
    - Schema management for nodes and relationships.
    - Query execution engine with support for multiple query languages.
- server/
  - Purpose: Houses the graphdb-cli binary for interactive and scripted use.
  - Subcomponents (server/src/cli/):
    - cli.rs: Parses command-line arguments and dispatches commands.
    - commands.rs: Defines CLI subcommands using the clap crate.
    - handlers.rs: Implements logic for commands (e.g., start/stop daemons).
    - interactive.rs: Manages the interactive CLI shell.
    - config.rs: Handles configuration (ports, data directories) via YAML/TOML.
    - daemon_management.rs: Manages daemon lifecycle (spawning, monitoring, stopping).
    - help_display.rs: Generates detailed help messages for CLI commands.
- daemon-api/
  - Purpose: Provides programmatic interfaces for controlling graphdb-daemon instances.
  - Features: Uses gRPC for efficient, language-agnostic communication between components.
- rest-api/
  - Purpose: Exposes RESTful endpoints for programmatic access.
  - Key Endpoints:
    - GET /api/v1/health: Checks system status.
    - POST /api/v1/query: Executes graph queries (Cypher, SQL, GraphQL).
    - POST /api/v1/start/port/{port}: Starts a single daemon.
    - POST /api/v1/start/cluster/{start}-{end}: Starts a daemon cluster.
    - POST /api/v1/stop: Shuts down components (optional parameters for specific daemons).
    - POST /api/v1/ingest (Planned): Ingests data in formats like FHIR.
    - GET /api/v1/nodes/{id} (Planned): Retrieves a specific node.
    - GET /api/v1/relationships/{id} (Planned): Retrieves a specific relationship.
- storage-daemon-server/
  - Purpose: Manages persistent storage with a pluggable architecture.
  - Supported Backends: Postgres, Redis, RocksDB, Sled.
  - Features: Ensures data durability, indexing, and transactional integrity.
- proto/
  - Purpose: Defines gRPC Protobuf messages and services for distributed communication.
- models/medical/
  - Purpose: Provides healthcare-specific graph structures and ontologies for context-aware queries.
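The traversal algorithms listed for graphdb-lib (BFS, DFS, shortest path) are standard graph techniques. As a language-neutral illustration, here is a breadth-first shortest-path search in Python over a plain adjacency dictionary. The function name and the sample graph are invented; the Rust implementation in graphdb-lib may be structured quite differently.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns a fewest-hop path as a list, or None."""
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


graph = {
    "patient": ["encounter"],
    "encounter": ["diagnosis", "provider"],
    "diagnosis": ["billing_code"],
}
print(shortest_path(graph, "patient", "billing_code"))
# ['patient', 'encounter', 'diagnosis', 'billing_code']
```

Because BFS explores nodes in order of hop count, the first time it reaches the goal it has found a path with the fewest edges; weighted graphs would need a different algorithm such as Dijkstra's.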
GraphDB components run as independent daemons, communicating via defined ports:
| Component | Default Port | Description |
|---|---|---|
| graphdb-daemon | 8080 | Core graph processing daemon |
| graphdb-rest_api | 8082 | REST API gateway |
| graphdb-storage-daemon | 8085 | Persistent storage daemon |
- Single Instance: Suitable for local development or small-scale deployments.
- Cluster Mode: Supports distributed processing across multiple ports (e.g., 9001-9003) for scalability.
- Use Cases:
- Interactive querying: CLI.
- Automation/scripting: REST API.
- Batch ingestion: CLI + Daemon.
- Distributed processing: gRPC (planned).
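Cluster mode takes a port range such as 9001-9003 and runs one daemon per port. The following Python sketch shows one plausible way to expand such a range into individual ports. The function `expand_port_range` and its validation rules are assumptions for illustration; the CLI's actual parsing logic is not shown in this document.

```python
def expand_port_range(spec):
    """Expand "9001-9003" into [9001, 9002, 9003]; a single port yields one entry."""
    start_text, _, end_text = spec.partition("-")
    start = int(start_text)
    end = int(end_text) if end_text else start
    # Reject reversed ranges and values outside the valid TCP port range.
    if not (0 < start <= end <= 65535):
        raise ValueError(f"invalid port range: {spec!r}")
    return list(range(start, end + 1))


print(expand_port_range("9001-9003"))  # [9001, 9002, 9003]
print(expand_port_range("8080"))       # [8080]
```

The range is treated as inclusive on both ends, matching the description of daemons launching on ports 9001 through 9003.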
The graphdb-cli binary provides flexible interaction options:
./target/release/graphdb-cli --cli # Start interactive shell
./target/release/graphdb-cli start --port 9001 # Start single daemon
./target/release/graphdb-cli start --cluster 9001-9003 # Start daemon cluster
./target/release/graphdb-cli stop # Stop all components
./target/release/graphdb-cli view-graph --graph-id 42 # View graph by ID
./target/release/graphdb-cli --query "MATCH (n) RETURN n" # Execute direct query

Interact with GraphDB programmatically via the REST API:
# Check system health
curl http://127.0.0.1:8082/api/v1/health
# Execute a graph query
curl -X POST http://127.0.0.1:8082/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query":"MATCH (n:Person {name: \"Alice\"}) RETURN n"}'

GraphDB supports pluggable storage backends:
- Postgres: Relational persistence and SQL queries.
- Redis: High-speed caching for transient data.
- RocksDB: Embedded key-value store for local performance.
- Sled: Lock-free, embedded database for Rust.
Custom backends can be implemented via trait interfaces.
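The pluggable-backend idea can be illustrated with an abstract interface and one implementation. The sketch below uses a Python abstract base class as an analogue of a Rust trait. The names `StorageBackend`, `MemoryBackend`, `put`, `get`, and `save_node` are hypothetical, and GraphDB's real trait will have its own methods and signatures.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical backend contract; GraphDB's real Rust trait may differ."""

    @abstractmethod
    def put(self, key, value):
        """Store a value under a key."""

    @abstractmethod
    def get(self, key):
        """Return the stored value, or None if the key is absent."""


class MemoryBackend(StorageBackend):
    """Simplest possible backend: a dictionary held in process memory."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def save_node(backend, node_id, properties):
    """Engine-side code depends only on the interface, not on a concrete backend."""
    backend.put("node:" + node_id, properties)


backend = MemoryBackend()
save_node(backend, "p1", {"name": "Alice"})
print(backend.get("node:p1"))  # {'name': 'Alice'}
```

Swapping in another backend then means writing a new class with the same two methods, leaving `save_node` and similar callers unchanged.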
GraphDB aims to evolve into a more intelligent platform:
- Natural Language Processing (NLP): Enhanced support for conversational queries, enabling non-technical users to interact with the database.
- AI-Driven Insights: Integration with machine learning models for predictive analytics, such as identifying at-risk patients or optimizing clinical workflows.
- Graph Visualization: A planned UI for exploring and visualizing graph data interactively.
- Distributed gRPC: Enhanced support for multi-language, distributed deployments.
GraphDB supports key healthcare standards:
- FHIR (STU3/R4)
- HL7 (v2/v3)
- CPT, ICD-10, LOINC, SNOMED
- X12 (837/835 claims)
- Planned: Retrieval-Augmented Generation (RAG) for NLP queries, time-series support for EEG/EKG data.
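Ontology support matters because codes such as ICD-10 are hierarchical: E11 is the category for Type 2 diabetes, while codes like E11.9 or E11.65 are more specific entries beneath it. An exact match on "E11" would therefore miss patients coded at a finer level. The Python sketch below shows a simple category test; the function `in_category` is an illustrative assumption and expects codes written with the dot separator.

```python
def in_category(code, category):
    """True if an ICD-10 code equals the category or is one of its subcodes."""
    normalized = code.strip().upper()
    return normalized == category or normalized.startswith(category + ".")


codes = ["E11", "E11.9", "E11.65", "E10.9", "I10"]
print([c for c in codes if in_category(c, "E11")])
# ['E11', 'E11.9', 'E11.65']
```

Data sources that store codes without the dot (for example E119) would need normalizing before a check like this applies.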
We welcome contributions to enhance GraphDB:
- Cypher query support (complete)
- NLP pipeline integration
- gRPC enhancements
- Graph explorer UI
Submit pull requests or report issues at https://github.com/dmitryro/graphdb/issues.
MIT License (see LICENSE).
- GitHub: https://github.com/dmitryro/graphdb
- Issues: https://github.com/dmitryro/graphdb/issues
- Documentation: https://docs.rs/graphdb
- Crates.io: https://crates.io/crates/graphdb