Centauron Network Architecture
The Centauron Network is built on a decentralized architecture designed to support secure, collaborative AI development in digital pathology. The final system design (Iteration 4), reached after several earlier iterations, combines a pragmatic monolithic node structure with a blockchain-based communication and trust layer. Together these provide data sovereignty, intellectual property protection, and scalable collaboration.
Core Principles
The architecture is founded on:
- Decentralized Data Sovereignty: Data owners retain full control over their data, which resides locally on their autonomous Centauron nodes.
- Peer-to-Peer Network: Centauron nodes communicate directly with each other, minimizing reliance on central intermediaries.
- Federated Computing: Computations, especially AI model training and validation, occur at the data source, preventing raw data transfer.
Centauron Node Architecture
Each Centauron node operates as a self-contained unit, primarily implemented as a monolithic web application using the Django framework. This design choice prioritizes ease of deployment and maintenance, which lowers the technical barrier to joining the network.
Key components within a node include:
- Monolithic Application: Handles core functionalities like project management, challenge execution, data management, and communication.
- Kubernetes Integration: Executes federated computations (e.g., AI model inference) in a secure, isolated, and resource-managed environment, so that external code runs on the node without uncontrolled data egress.
- Local Data Storage: Manages WSIs and annotations, ensuring data locality.
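The isolation described under Kubernetes Integration can be pictured with a minimal sketch. It assumes that a node submits external code as a Kubernetes Job and blocks outbound traffic with a NetworkPolicy; the function names, the label, the resource limits, and the paths are hypothetical and not taken from the Centauron code base. The manifests are built as plain dictionaries so the example does not depend on a cluster or a client library.

```python
def build_inference_job(job_name: str, image: str, data_dir: str) -> dict:
    """Return a batch/v1 Job manifest that mounts local slide data read-only."""
    labels = {"app": "federated-compute"}  # hypothetical label
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name, "labels": labels},
        "spec": {
            "backoffLimit": 0,  # do not retry failed external code
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "restartPolicy": "Never",
                    "automountServiceAccountToken": False,
                    "containers": [{
                        "name": "model",
                        "image": image,
                        # Assumed limits; real values depend on the node.
                        "resources": {"limits": {"cpu": "4", "memory": "16Gi"}},
                        "securityContext": {
                            "runAsNonRoot": True,
                            "allowPrivilegeEscalation": False,
                        },
                        "volumeMounts": [{
                            "name": "slides",
                            "mountPath": "/data",
                            "readOnly": True,
                        }],
                    }],
                    "volumes": [{"name": "slides", "hostPath": {"path": data_dir}}],
                },
            },
        },
    }


def build_deny_egress_policy() -> dict:
    """Return a NetworkPolicy that allows no egress for the compute pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "deny-egress-federated-compute"},
        "spec": {
            "podSelector": {"matchLabels": {"app": "federated-compute"}},
            # Listing Egress without any egress rules denies all outbound traffic.
            "policyTypes": ["Egress"],
        },
    }
```

A NetworkPolicy only takes effect when the cluster's network plugin enforces it, so a real deployment would need to verify that as well.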
Communication Layer
Centauron employs a hybrid communication model:
- Broadcast Messages (Blockchain & IPFS): For network-wide information dissemination, such as announcing new nodes or users, and for logging critical actions.
  - Messages are stored in IPFS (InterPlanetary File System), a distributed file system.
  - The unique Content Identifier (CID) of the IPFS message is then written to a new block on the network's permissioned blockchain, powered by Hyperledger Besu.
  - This creates a decentralized, chronological, and immutable record of network state, replacing the need for a central authority for node registration.
- Private Messages: Direct, secure communication between two Centauron nodes for specific interactions (e.g., sending a submission from a Hub to a Project-PI, or from a Project-PI to a Data Owner).
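The broadcast path (store the message, then record its identifier on chain) can be illustrated with an in-memory sketch. This is a simulation under stated assumptions, not the Centauron implementation: `ContentStore` stands in for IPFS, `Ledger` stands in for the Besu chain, and a plain SHA-256 hex digest stands in for a real CID, which uses multihash encoding. All class and function names are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field


def content_id(payload: bytes) -> str:
    """Stand-in for an IPFS CID: the identifier is derived from the content."""
    return hashlib.sha256(payload).hexdigest()


@dataclass
class ContentStore:
    """In-memory substitute for IPFS."""
    blobs: dict = field(default_factory=dict)

    def add(self, payload: bytes) -> str:
        cid = content_id(payload)
        self.blobs[cid] = payload
        return cid

    def get(self, cid: str) -> bytes:
        payload = self.blobs[cid]
        if content_id(payload) != cid:  # content addressing makes tampering detectable
            raise ValueError("stored content does not match its identifier")
        return payload


@dataclass
class Ledger:
    """In-memory substitute for the blockchain: an append-only, hash-linked list."""
    blocks: list = field(default_factory=list)

    def append(self, cid: str) -> int:
        number = len(self.blocks)
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        digest = hashlib.sha256(f"{number}:{prev_hash}:{cid}".encode()).hexdigest()
        self.blocks.append(
            {"number": number, "prev_hash": prev_hash, "cid": cid, "hash": digest}
        )
        return number


def broadcast(store: ContentStore, ledger: Ledger, message: dict) -> str:
    """Store the message off chain and record only its identifier on chain."""
    payload = json.dumps(message, sort_keys=True).encode()
    cid = store.add(payload)
    ledger.append(cid)
    return cid


def replay(store: ContentStore, ledger: Ledger) -> list:
    """Rebuild the chronological message history by walking the ledger."""
    return [json.loads(store.get(block["cid"])) for block in ledger.blocks]
```

Because only the identifier is written to the ledger, blocks stay small, and any node can reconstruct the network state by replaying the ledger and fetching each message by its identifier.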
Blockchain Integration (Hyperledger Besu)
The network utilizes a permissioned blockchain, implemented with Hyperledger Besu (an Ethereum client), for several critical functions:
- Immutable Logging: All relevant data usage, intellectual property transfers, and key network events are permanently recorded on the blockchain, providing a transparent and auditable trail.
- Identity Management: New nodes register themselves on the blockchain via broadcast messages, creating a decentralized directory of network participants.
- Smart Contracts: The prototype does not fully implement smart contracts for the data economy, but the architecture supports them for programmatic enforcement of licensing agreements and automated financial transactions (e.g., tokenized assets such as NFTs for data and AI apps).
- Consensus Mechanism: Leverages a Proof-of-Authority (PoA) variant (like IBFT 2.0 or QBFT) for efficient and secure consensus among known validators.
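To give a feel for what consensus among known validators implies, the following sketch computes the fault tolerance usually stated for Byzantine fault tolerant protocols such as IBFT 2.0 and QBFT: a network of n validators tolerates up to floor((n - 1) / 3) faulty validators and needs agreement from at least two thirds of them. The function names are hypothetical, and the validator counts are illustrative only, since the source does not state how many validators the Centauron Network runs.

```python
def max_faulty_validators(n: int) -> int:
    """Largest number of Byzantine validators that n validators can tolerate."""
    if n < 1:
        raise ValueError("at least one validator is required")
    return (n - 1) // 3


def required_quorum(n: int) -> int:
    """Smallest number of agreeing validators, i.e. ceil(2n / 3)."""
    if n < 1:
        raise ValueError("at least one validator is required")
    return (2 * n + 2) // 3


if __name__ == "__main__":
    for n in (3, 4, 7):
        print(n, max_faulty_validators(n), required_quorum(n))
```

Under this arithmetic, four validators are the minimum for tolerating one faulty validator, which is why small permissioned networks are often started with four.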
Authentication and Authorization
Security is paramount in the Centauron Network:
- Node Authentication (mTLS): Centauron nodes authenticate each other using mutual Transport Layer Security (mTLS) with X.509 certificates issued by a trusted Certificate Authority (CA). This ensures secure, encrypted communication between nodes.
- User Authentication: Users authenticate to their local Centauron node (e.g., via Keycloak).
- Authorization (AllowList): Access to data and functionalities is controlled by an AllowList mechanism, ensuring that only authorized users and nodes can interact with specific resources.
- Decentralized Identifiers (DIDs): Entities (WSIs, projects, users, nodes) are uniquely identified across the network using DIDs, which incorporate the originating node's identifier and a UUID.
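The identifier and AllowList ideas can be sketched together. The source says only that an identifier combines the originating node's identifier with a UUID, so the `node_id:uuid` string layout below is an assumption, as are the class names and the deny-by-default behavior of the AllowList.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityId:
    """Network-wide identifier: originating node plus a UUID (assumed layout)."""
    node_id: str
    entity_uuid: uuid.UUID

    @classmethod
    def new(cls, node_id: str) -> "EntityId":
        return cls(node_id, uuid.uuid4())

    @classmethod
    def parse(cls, text: str) -> "EntityId":
        # Split on the last colon so node identifiers may themselves contain colons.
        node_id, _, raw_uuid = text.rpartition(":")
        if not node_id:
            raise ValueError(f"missing node identifier in {text!r}")
        return cls(node_id, uuid.UUID(raw_uuid))

    def __str__(self) -> str:
        return f"{self.node_id}:{self.entity_uuid}"


@dataclass
class AllowList:
    """Maps each resource identifier to the principals allowed to access it."""
    entries: dict = field(default_factory=dict)

    def grant(self, resource: EntityId, principal: EntityId) -> None:
        self.entries.setdefault(str(resource), set()).add(str(principal))

    def is_allowed(self, resource: EntityId, principal: EntityId) -> bool:
        # Anything not explicitly granted is denied.
        return str(principal) in self.entries.get(str(resource), set())
```

Embedding the node identifier means any node can tell from an identifier alone which node originated the entity, without consulting a central registry.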
Network Structure and Access Points
- Autonomous Nodes: Each participant operates their own Centauron node, maintaining data sovereignty.
- Delegated Nodes: For data owners without the technical expertise to run a node, delegated nodes allow participation while preserving data control.
- Hubs: Centralized access points (which can be operated independently) for discovering datasets, challenges, and AI applications. They facilitate submissions and provide aggregated results, acting as a bridge for broader network interaction.
This architecture provides a robust, secure, and scalable foundation for collaborative AI development in digital pathology, addressing critical challenges related to data privacy, intellectual property, and trust.