Skip to content

dmitryro/graphdb-rs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

GraphDB

Build Status Rust Crates.io Docs.rs License Status

GraphDB is an experimental graph database engine and command-line interface (CLI) optimized for medical and healthcare applications. It empowers developers, researchers, and healthcare professionals to build, query, and analyze interconnected medical data with high context-awareness. By leveraging a graph-native approach, GraphDB unlocks insights from complex relationships that traditional relational databases struggle to handle, making it an ideal complement to existing Electronic Health Record (EHR) systems.

Note: GraphDB is under active development. APIs and behavior may change before the 1.0 release. In production, ensure encryption, authentication, and access controls are configured to meet HIPAA/GDPR compliance requirements.

๐Ÿ“ Table of Contents

๐Ÿš‘ Why Medical Practices Need GraphDB

Electronic Health Record (EHR) systems typically rely on linear, table-based relational models. However, medical data is inherently interconnected, forming complex relationships that are challenging to represent or query efficiently in traditional systems. For example:

  • Patients have encounters with providers ๐Ÿ‘ฉโ€โš•๏ธ.
  • Encounters generate diagnoses, procedures, notes, and billing codes ๐Ÿ“.
  • Medications and prescriptions involve drug interactions and side effects ๐Ÿ’Š.
  • Data flows from devices, labs, insurers, pharmacies, and public health databases ๐Ÿ“Š.

Complex queries, such as:

  • "Which patients are at risk based on recent prescriptions and lab results?" โš ๏ธ
  • "Which providers might be undercoding based on their encounter history?" ๐Ÿ“‰
  • "Show a patientโ€™s medical, behavioral, and socioeconomic history over the past 3 years." ๐Ÿ“ˆ

are inefficient or infeasible in relational models. GraphDB addresses this gap by providing a graph-native database that excels at modeling and querying these relationships, enabling faster, more intuitive insights for healthcare applications.

๐Ÿง  Key Benefits

GraphDB offers unique advantages for healthcare data management:

  • Intuitive Data Modeling: Represents complex medical relationships (e.g., patient-provider interactions, drug interactions) as nodes and edges, making data exploration natural and efficient.
  • Powerful Querying: Supports natural language, Cypher, SQL, and GraphQL queries, enabling both technical and non-technical users to extract insights.
  • Seamless Integration: Complements existing EHR systems by ingesting data in formats like FHIR, HL7, or CSV, acting as a smart middleware layer.
  • Scalable Architecture: Supports standalone, daemonized, or clustered deployments for flexibility and performance.
  • Healthcare-Specific Features: Includes built-in support for medical ontologies (e.g., ICD-10, SNOMED) and planned AI-driven analytics for advanced insights.
  • Open-Source and Extensible: MIT-licensed with a pluggable architecture, encouraging community contributions and custom extensions.

๐Ÿงน What It Does

GraphDB is designed to handle the complexity of medical data through:

  • Graph-Native Data Model: Uses vertices (nodes) and edges (relationships) to capture nuanced connections in medical data, such as patient diagnoses or provider interactions.
  • Natural Language Querying: Transforms high-level or natural language queries into efficient graph query languages (e.g., Cypher, SQL, GraphQL) for ease of use.
  • Flexible Deployment: Offers a powerful CLI for interactive and scripted use, alongside a daemonized REST API for integration with existing systems.
  • Pluggable Extensions: Supports healthcare-specific plugins for standards like FHIR, HL7, ICD-10, CPT, and X12.
  • Middleware Capabilities: Acts as a context-aware layer for legacy or modern EHR systems, enhancing their relational capabilities.
  • Advanced Analytics: Enables graph analytics, risk modeling, explainable AI, and auditable traceability for compliance and insights.

๐Ÿงน Quick Example

Hereโ€™s a simple Cypher query to find patients diagnosed with Type 2 Diabetes (ICD-10 code E11):

MATCH (p:Patient)-[:HAS_DIAGNOSIS]->(d:Diagnosis)
WHERE d.code = "E11"
RETURN p.name, p.age

This query traverses the graph to return patient names and ages, demonstrating GraphDBโ€™s ability to handle relational queries efficiently.

๐Ÿงน Architecture

GraphDBโ€™s modular, daemonized architecture ensures scalability, performance, and flexibility. Below is a visual representation of its components and their interactions:

+-------------------------------------------------------------------------+
|                  graphdb-cli (Interactive & Scriptable Client)          |
| +---------------------------------------------------------------------+ |
| | Parses CLI commands, transforms queries, dispatches to daemons      | |
| +---------------------------------------------------------------------+ |
+------------------------------------|------------------------------------+
                                     | (Local Process / HTTP / gRPC)
                                     โ†“
+-------------------------------------------------------------------------+
|                  graphdb-rest_api (REST API Gateway)                    |
| +---------------------------------------------------------------------+ |
| | Exposes RESTful endpoints for programmatic access                  | |
| | Handles authentication, routing, and data serialization             | |
| +---------------------------------------------------------------------+ |
+------------------------------------|------------------------------------+
                                     | (gRPC / Internal IPC)
                                     โ†“
+-------------------------------------------------------------------------+
|                  graphdb-daemon (Core Graph Processing Daemon)          |
| +---------------------------------------------------------------------+ |
| | Manages graph state, executes queries, handles concurrency          | |
| | Uses graphdb-lib for graph modeling and query execution            | |
| | Supports single-instance or clustered deployments                  | |
| +---------------------------------------------------------------------+ |
+------------------------------------|------------------------------------+
                                     | (Internal IPC / Storage Protocol)
                                     โ†“
+-------------------------------------------------------------------------+
|                  graphdb-storage-daemon (Pluggable Storage Backend)     |
| +---------------------------------------------------------------------+ |
| | Manages persistent storage, indexing, and transactional integrity  | |
| | Supports multiple backends (Postgres, Redis, RocksDB, Sled)         | |
| +---------------------------------------------------------------------+ |

This architecture allows independent scaling of components, supporting both lightweight local deployments and distributed, high-performance clusters.

๐Ÿ”Œ How It Works

GraphDB processes data through a streamlined workflow:

  1. Input Parsing: Queries from the CLI or REST API (natural language, Cypher, SQL, or GraphQL) are parsed and transformed into an internal graph traversal representation.
  2. Daemonized Execution: The graphdb-daemon handles query execution, maintains graph state, and supports concurrent access. It leverages in-memory caching for performance.
  3. Storage Management: The graphdb-storage-daemon abstracts persistent storage, supporting multiple backends (e.g., Postgres, RocksDB) via pluggable interfaces.
  4. Integration: GraphDB can operate standalone for graph analysis or integrate into existing healthcare IT pipelines, enhancing data interoperability.

๐ŸŒ Complementing Existing EHRs

GraphDB enhances, rather than replaces, existing EHR systems by:

  • Ingesting Data: Supports formats like CSV, HL7, FHIR, or direct Postgres connections, making it easy to import data from EHRs.
  • Transforming Data: Converts structured and semi-structured data into a queryable graph model, preserving relationships.
  • Enabling Insights: Facilitates temporal and semantic joins across disparate datasets, uncovering insights hidden in relational structures (e.g., linking patient records with lab results and billing codes).

๐Ÿงช Example Use Cases

GraphDBโ€™s graph-native approach unlocks powerful healthcare applications:

  • Clinical Decision Support: Identify drug-allergy interactions or suggest treatment paths by traversing patient history graphs in real-time. ๐Ÿฉบ
  • Billing Optimization: Detect missed CPT coding opportunities or fraudulent billing patterns using graph-based anomaly detection. ๐Ÿ’ฐ
  • Patient Risk Modeling: Build longitudinal graphs of patient medical, behavioral, and socioeconomic factors for predictive analytics and proactive care. ๐Ÿ“Š
  • Security and Compliance: Visualize user access logs as graphs to ensure HIPAA/GDPR compliance and detect unauthorized access. ๐Ÿ”’
  • Research and Epidemiology: Analyze disease propagation networks, identify clinical trial cohorts, or study social determinants of health. ๐Ÿ”ฌ

๐Ÿš€ Getting Started

๐Ÿ“‹ Installation Prerequisites

Before building GraphDB, ensure the following are installed:

  • Rust: Version 1.72 or higher (rustup install 1.72).
  • Cargo: Included with Rust for building and managing dependencies.
  • Git: For cloning the repository.
  • Optional Backends (if used):
    • Postgres: For relational storage.
    • Redis: For caching.
    • RocksDB/Sled: For embedded key-value storage.

๐Ÿ› ๏ธ Building GraphDB

  1. Clone the repository:

    git clone [https://github.com/dmitryro/graphdb.git](https://github.com/dmitryro/graphdb.git)
    cd graphdb
  2. Build the CLI executable:

    cargo build --workspace --release --bin graphdb-cli

    The compiled binary will be located at ./target/release/graphdb-cli.

๐Ÿš€ Running GraphDB Components

GraphDB supports multiple interaction modes:

  • Interactive CLI: For exploratory querying and management.
  • Scripted CLI: For automation and batch processing.
  • REST API: For programmatic integration with other applications.

CLI Commands

  • Start Interactive CLI:

    ./target/release/graphdb-cli --cli

    Enter commands like start, stop, status, or rest graph-query.

  • Start a Single Graph Daemon:

    ./target/release/graphdb-cli start --port 9001

    Default port is 8080 if --port is omitted.

  • Start a Daemon Cluster:

    ./target/release/graphdb-cli start --cluster 9001-9003

    Launches daemons on ports 9001โ€“9003.

  • Start REST API and Storage Daemon:

    ./target/release/graphdb-cli start --listen-port 8082 --storage-port 8085

    REST API runs on port 8082, storage daemon on 8085.

  • Stop Components:

    • Stop all components:

      ./target/release/graphdb-cli stop
    • Stop specific components:

      ./target/release/graphdb-cli stop rest
      ./target/release/graphdb-cli stop daemon --port 9001
      • Stop the Storage Daemon by port:
      ./target/release/graphdb-cli stop storage --port 8085

Querying Data

  • Direct CLI Query:

    ./target/release/graphdb-cli --query "MATCH (n) RETURN n"
  • Interactive CLI Query:

    graphdb-cli> rest graph-query "MATCH (p:Patient) RETURN p.name LIMIT 5"
    
  • REST API Query:

    curl -X POST [http://127.0.0.1:8082/api/v1/query](http://127.0.0.1:8082/api/v1/query) \
      -H "Content-Type: application/json" \
      -d '{"query":"MATCH (n:Person {name: \"Alice\"}) RETURN n"}'

๐Ÿ“‚ File Structure

The project is organized for modularity and maintainability:

  • graphdb-lib/ ๐Ÿง : Core graph engine, data structures, and query parsing.
  • server/ ๐Ÿ’ป: CLI application (graphdb-cli) and its components.
  • daemon-api/ โš™๏ธ: Interfaces for daemon communication (e.g., gRPC).
  • rest-api/ ๐ŸŒ: RESTful API gateway for external access.
  • storage-daemon-server/ ๐Ÿ—„๏ธ: Pluggable storage backend daemon.
  • proto/ ๐Ÿ“ฆ: gRPC service definitions for distributed setups.
  • models/medical/ โš•๏ธ: Healthcare-specific graph structures and ontologies.

๐Ÿ“ฆ Crate/Module Details

graphdb-lib ๐Ÿง 

  • Purpose: Core graph engine with data structures (nodes, edges), traversal algorithms (BFS, DFS, shortest path), and query parsing for Cypher, SQL, and GraphQL.
  • Features:
    • Efficient in-memory graph representation.
    • Schema management for nodes and relationships.
    • Query execution engine with support for multiple query languages.

server ๐Ÿ’ป

  • Purpose: Houses the graphdb-cli binary for interactive and scripted use.
  • Subcomponents (server/src/cli/):
    • cli.rs: Parses command-line arguments and dispatches commands.
    • commands.rs: Defines CLI subcommands using the clap crate.
    • handlers.rs: Implements logic for commands (e.g., start/stop daemons).
    • interactive.rs: Manages the interactive CLI shell.
    • config.rs: Handles configuration (ports, data directories) via YAML/TOML.
    • daemon_management.rs: Manages daemon lifecycle (spawning, monitoring, stopping).
    • help_display.rs: Generates detailed help messages for CLI commands.

daemon-api โš™๏ธ

  • Purpose: Provides programmatic interfaces for controlling graphdb-daemon instances.
  • Features: Uses gRPC for efficient, language-agnostic communication between components.

rest-api ๐ŸŒ

  • Purpose: Exposes RESTful endpoints for programmatic access.
  • Key Endpoints:
    • GET /api/v1/health: Checks system status.
    • POST /api/v1/query: Executes graph queries (Cypher, SQL, GraphQL).
    • POST /api/v1/start/port/{port}: Starts a single daemon.
    • POST /api/v1/start/cluster/{start}-{end}: Starts a daemon cluster.
    • POST /api/v1/stop: Shuts down components (optional parameters for specific daemons).
    • POST /api/v1/ingest (Planned): Ingests data in formats like FHIR.
    • GET /api/v1/nodes/{id} (Planned): Retrieves a specific node.
    • GET /api/v1/relationships/{id} (Planned): Retrieves a specific relationship.

storage-daemon-server ๐Ÿ—„๏ธ

  • Purpose: Manages persistent storage with a pluggable architecture.
  • Supported Backends: Postgres, Redis, RocksDB, Sled.
  • Features: Ensures data durability, indexing, and transactional integrity.

proto ๐Ÿ“ฆ

  • Purpose: Defines gRPC Protobuf messages and services for distributed communication.

models/medical โš•๏ธ

  • Purpose: Provides healthcare-specific graph structures and ontologies for context-aware queries.

โšก Ports, Daemons, and Clusters

GraphDB components run as independent daemons, communicating via defined ports:

Component Default Port Description
graphdb-daemon 8080 Core graph processing daemon
graphdb-rest_api 8082 REST API gateway
graphdb-storage-daemon 8085 Persistent storage daemon
  • Single Instance: Suitable for local development or small-scale deployments.
  • Cluster Mode: Supports distributed processing across multiple ports (e.g., 9001โ€“9003) for scalability.
  • Use Cases:
    • Interactive querying: CLI.
    • Automation/scripting: REST API.
    • Batch ingestion: CLI + Daemon.
    • Distributed processing: gRPC (planned).

๐Ÿ’ป Command-Line Interface (CLI) Usage

The graphdb-cli binary provides flexible interaction options:

./target/release/graphdb-cli --cli                        # Start interactive shell
./target/release/graphdb-cli start --port 9001           # Start single daemon
./target/release/graphdb-cli start --cluster 9001-9003   # Start daemon cluster
./target/release/graphdb-cli stop                        # Stop all components
./target/release/graphdb-cli view-graph --graph-id 42    # View graph by ID
./target/release/graphdb-cli --query "MATCH (n) RETURN n" # Execute direct query

๐ŸŒ REST API Usage

Interact with GraphDB programmatically via the REST API:

# Check system health
curl [http://127.0.0.1:8082/api/v1/health](http://127.0.0.1:8082/api/v1/health)

# Execute a graph query
curl -X POST [http://127.0.0.1:8082/api/v1/query](http://127.0.0.1:8082/api/v1/query) \
  -H "Content-Type: application/json" \
  -d '{"query":"MATCH (n:Person {name: \"Alice\"}) RETURN n"}'

๐Ÿ—„๏ธ Storage Backends

GraphDB supports pluggable storage backends:

  • Postgres: Relational persistence and SQL queries.
  • Redis: High-speed caching for transient data.
  • RocksDB: Embedded key-value store for local performance.
  • Sled: Lock-free, embedded database for Rust. Custom backends can be implemented via trait interfaces.

๐Ÿ”ฎ Future Vision: Advanced Querying & AI Integration

GraphDB aims to evolve into a more intelligent platform:

  • Natural Language Processing (NLP): Enhanced support for conversational queries, enabling non-technical users to interact with the database.
  • AI-Driven Insights: Integration with machine learning models for predictive analytics, such as identifying at-risk patients or optimizing clinical workflows.
  • Graph Visualization: A planned UI for exploring and visualizing graph data interactively.
  • Distributed gRPC: Enhanced support for multi-language, distributed deployments.

๐Ÿงฌ Medical Ontology Support

GraphDB supports key healthcare standards:

  • FHIR (STU3/STU4)
  • HL7 (v2/v3)
  • CPT, ICD-10, LOINC, SNOMED
  • X12 (837/835 claims)
  • Planned: Retrieval-Augmented Generation (RAG) for NLP queries, time-series support for EEG/EKG data.

๐Ÿ“ข Contributing

We welcome contributions to enhance GraphDB:

  • โœ… Cypher query support (complete)
  • NLP pipeline integration
  • gRPC enhancements
  • Graph explorer UI

Submit pull requests or report issues at https://github.com/dmitryro/graphdb/issues.

๐Ÿ“œ License

MIT License (see LICENSE).

๐ŸŒ Links