FIXMEAPP

Research area

Ethical Data Economy

Cryptographically verifiable, incentivized data contribution — Merkle-tree-based proof-of-concept where users own their data, choose what to share, and earn rewards for participation. Built on the FACT principles.

Ethical Data Economy research investigates how a service booking platform can support a data marketplace where individuals own their data, choose what to share, and earn rewards for participation. The work spans the spectrum from cryptographic primitives that enable verifiable data sharing to the marketplace mechanisms that make participation economically meaningful.

Two completed degree projects approach the cryptographic primitives from complementary angles: predicate-based verification using Ordered Binary Decision Diagrams, and Merkle-tree contribution proofs aligned with the FACT principles. Both address the same underlying problem: how does a centralized social platform verify user attributes for matching without ever storing or exposing the raw data? Together they sit at the intersection of cryptography, market design, and policy.

Thesis 01

Verifiable Data Contribution

Eötvös Loránd University × Aalto University · 2026 · Completed · Publishing soon

The thesis presents a SHA-256 Merkle-tree contribution system aligned with the FACT principles (Fairness, Accuracy, Confidentiality, Transparency) introduced by Sohail et al. as a successor to FAIR. The architecture is inspired by the decentralized Wibson protocol but redesigned around a centralized Governor model that fits a single-company booking service.

Three roles: the User (data producer and owner, rewarded per contribution), the Service Provider (the buyer of aggregated signals — never the individual), and the Governor (Fixmeapp — verifies signal validity, signs matches, and intermediates the transaction). Raw user data is pseudo-anonymized within 30 days (demographic data as buckets, locations as geographical polygons) and then encoded in a Merkle tree using SHA-256 hashing. Only the trees — cryptographically resistant to preimage, second-preimage and collision attacks — are queryable from outside.
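The encoding step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the record format, the ten-year age bucket, the polygon identifier, and all function names are hypothetical, and duplicating the last leaf on odd-sized levels is a simplification chosen only for brevity. It shows how bucketed records become SHA-256 leaves, and how a single record's inclusion can be checked against the root without revealing any other record.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def age_bucket(age: int, width: int = 10) -> str:
    # Hypothetical bucketing rule: 29 -> "20-29".
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def leaf_hash(record: str) -> bytes:
    # Distinct prefixes keep leaf hashes and inner-node hashes apart.
    return sha256(b"\x00" + record.encode("utf-8"))


def node_hash(left: bytes, right: bytes) -> bytes:
    return sha256(b"\x01" + left + right)


def build_levels(records):
    """Return all tree levels, leaves first, root level last."""
    level = [leaf_hash(r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Simplification: pad odd levels by repeating the last hash.
            level = level + [level[-1]]
            levels[-1] = level
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect sibling hashes from the leaf at `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(record, proof, root):
    """Recompute the root from one record and its sibling path."""
    h = leaf_hash(record)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root


# Hypothetical pseudo-anonymized records: age bucket plus polygon id.
records = [
    f"age:{age_bucket(29)}|region:polygon-17",
    f"age:{age_bucket(41)}|region:polygon-03",
    f"age:{age_bucket(35)}|region:polygon-17",
]
levels = build_levels(records)
root = levels[-1][0]
print(verify(records[2], prove(levels, 2), root))
```

In this sketch only `root` and the sibling path would need to leave the Governor; the verifier never sees the other records.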

The proof-of-concept successfully maps 10,000 synthetic users into provable, queryable trees in roughly one second per thousand entries. It is evaluated against quantitative FACT metrics: discrete Gini coefficient for fairness, Dawid-Skene "no-truth" accuracy, ε-differential privacy for confidentiality, and an effectiveness measure for transparency. Structural confidentiality is guaranteed by the hash architecture; the conclusion validates feasibility at scale.
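As an illustration of the fairness metric, the following sketch computes a discrete Gini coefficient over per-user rewards. The exact formulation used in the thesis is not given here, so this assumes the standard sorted-rank form; the function name and the sample reward vectors are hypothetical.

```python
def gini(rewards):
    """Discrete Gini coefficient: 0 means equal rewards, (n-1)/n is the maximum."""
    n = len(rewards)
    total = sum(rewards)
    if n == 0 or total == 0:
        return 0.0
    xs = sorted(rewards)
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


print(gini([1, 1, 1, 1]))  # equal rewards -> 0.0
print(gini([0, 0, 0, 4]))  # one user takes everything -> 0.75
```

A value near zero would indicate that rewards are spread evenly across contributors under this measure.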

Thesis 02

Predicate-Based Data Sharing

KTH, Royal Institute of Technology · 2026 · Completed

This thesis builds a predicate-based verification system inspired by the WibsonTree Protocol. The prototype uses Ordered Binary Decision Diagrams (OBDDs) to decouple raw data storage from eligibility verification, allowing the platform to verify user attributes "blindly" without ever loading the underlying data.
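To make the data structure concrete, here is a minimal sketch of how an eligibility predicate can be represented and evaluated as an OBDD. It covers only the decision diagram itself; how the thesis binds the diagram to hashed or committed attributes is not described here and is not modeled. The attribute names, the predicate, and the variable ordering are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    var: str      # boolean attribute tested at this node
    low: "Bdd"    # branch followed when the attribute is False
    high: "Bdd"   # branch followed when the attribute is True


# A diagram is either a terminal (True/False) or an inner node.
Bdd = Union[bool, Node]


def evaluate(bdd: Bdd, assignment: dict) -> bool:
    """Walk from the root to a terminal, testing one variable per step."""
    while isinstance(bdd, Node):
        bdd = bdd.high if assignment[bdd.var] else bdd.low
    return bdd


# Hypothetical predicate: age_25_34 AND (in_region_a OR in_region_b),
# with variable ordering age_25_34 < in_region_a < in_region_b.
region_b = Node("in_region_b", False, True)
region_a = Node("in_region_a", region_b, True)
eligible = Node("age_25_34", False, region_a)

user = {"age_25_34": True, "in_region_a": False, "in_region_b": True}
print(evaluate(eligible, user))
```

Because every path tests the variables in the same fixed order, evaluation touches each attribute at most once and returns only the yes/no outcome of the predicate.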

The system is benchmarked in a controlled environment against two alternatives: a traditional raw-storage baseline and a Zero-Knowledge Proof (ZKP) system using the Groth16 proving scheme. The findings show that predicate-based architectures offer an efficiency-privacy compromise: they are lighter than ZKP (which has prohibitive computational costs) and much safer than raw storage. The privacy tax is a communication payload approximately 52.5× larger than that of the traditional approach. Verifications are processed at an acceptable rate without critical bottlenecks.

The conclusion is direct: "the storage of raw sensitive data is a legacy practice rather than a necessity in centralized social platforms." Verification of user attributes can be decoupled from storage, and the architecture is practical at scale.

Across the theses in this area

Key themes

  • FACT principles — Fairness, Accuracy, Confidentiality, Transparency (successor to FAIR)
  • Merkle-tree-based contribution proofs (SHA-256) — preimage, second-preimage, collision resistance
  • Predicate-based verification via Ordered Binary Decision Diagrams (OBDD)
  • Selective disclosure: prove eligibility without revealing the underlying data
  • Benchmark study: raw storage vs OBDD predicates vs Zero-Knowledge Proof (Groth16)
  • Centralized Governor model — Wibson protocol adapted to single-company scale
  • Three roles: User (producer + owner + rewardee), Service Provider (buyer), Governor (verifier)
  • Pseudo-anonymization: demographic buckets, geographical polygons, no raw PII queryable
  • Quantitative FACT metrics: Gini, Dawid-Skene accuracy, ε-differential privacy, effectiveness
  • OpenPDS-aligned: Open Personal Data Store patterns for user-controlled disclosure