Newnal Data Market
Detailed description of Newnal Data Market
Table of Contents
Introduction
With the rapid advancement of AI technologies like Large Language Models (LLM), the use and security of personal data have become crucial issues. Currently, most user data is stored on centralized servers, and users do not have sufficient control over their data, such as access or deletion. As a result, there are very few platforms where users can manage and trade their data. Moreover, there is a lack of platforms where data from various services, including specific certificates or medical data, can be traded in a trustworthy manner ensuring the source and integrity.
Newnal Data Market is a platform where personal data can be traded freely in a decentralized and verifiable manner. Users can store verifiable data (Verifiable Credential) received from Data Providers in their Personal Web Node (PWN) and sell this data as a Data Package (Verifiable Presentation). Data buyers can purchase data with guaranteed source and integrity. Additionally, all financial flows through transactions are conducted on-chain, ensuring transparency.
Participants
Buyer
The entity purchasing the data.
Seller(Owner)
The entity operating the PWN and selling the data. The data stored in the PWN is packaged as a Data Package and submitted to the Verifier.
Agency
Periodically queries the Index Server for data sellers matching the transaction conditions on behalf of the Seller or Buyer and sends Data Package submission alerts.
Verifier
The trusted data transaction validator designated by the Buyer, who verifies the Data Package submitted by the Seller. Identified by DID.
PWN(Personal Web Node)
A personal web node operated by the Data Owner. Personal data issued by the Data Provider is stored in the PWN, and the index information of the data is stored in the Index Server.
Data Provider(Issuer)
Various services that issue data in a verifiable form (VC) to users. (e.g., Netflix, Instagram, Google)
PWN Index Server
A server that indexes identification information and data categories of DID received from the PWN and provides a list of DIDs that meet specific conditions.
Data Market Service
Handles authentication and various functions for buyers, sellers, and agents, and interaction between the Data Market Chain.
Data Market Chain
A blockchain (parachain) containing the service logic of the Data Market. Consists of smart contracts for data purchase registration and transaction, agency registration, and delegation.
DID Chain
A blockchain (parachain) containing the service logic related to DID. Consists of smart contracts for DID registration and revocation, document query, and modification.
Main Components
Data Package
A data-package is a zip file composed of personal data arranged by the Owner according to the data transaction conditions, consisting of VC and File, verifying the source and integrity of the data.
The source of the data can be legally verified through VC signature verification.
If the data issued with the VC includes a File, the source and integrity of the File can be verified through the digest credentialSubject of the VC (*.json) with the same name as the File.
The index.json, which shows the file structure of the data-package, is a VC signed by the Owner and is similar to a file system.
"_dataScope": The data range specified by the buyer when purchasing data. The Owner must compose the data package with personal data matching this scope.
“_type”: Indicates the type of the file. VC (Verifiable Credential), Bin (file), Dir (folder).
For Bin (file) type, there must be a VC with the same name as the file, and the VC must have digest and originalFilename as credentialSubject.
Creation and Delivery of Data Package
The Owner creates the data-package according to the dataScope (graphQL query) received from the index server or agency.
The created data package is submitted to the verifier in the following form:
data-package (ZIP) ⇒ buffer ⇒ base64 encoding ⇒ encryption (JWE).
The Verifier obtains the data-package through the reverse process and verifies the VC, file checksum, and data scope.
Verifier
The Verifier is the entity that verifies the data-package received from the Owner and executes the trade settlement (execute_trade) transaction upon successful verification. The Buyer can designate a trusted Verifier for the purchase, or the Buyer can operate the Verifier Server directly.
When the Owner calls the submit-data API, the Verifier performs three main tasks.
submit_data
API: API for the Owner to submit the data-package to the Verifier.
Input Param
Purchase ID
Seller DID
JWE: Encrypted Data Package
JWE
.
Data Signing and Encryption (ECDH-1PU)
Data encryption between Owner and Verifier through ECDH-1PU.
verify_data
: Decrypts and verifies the encrypted data-package (Verifier internal method).
Data Decryption and Verification
Decrypts to obtain the data-package.
DID authentication for the Owner during the decryption process (ECDH-1PU).
Verifies the data-package.
Verifies the authenticity and data scope of the decrypted data-package.
Authenticity Verification: Verifies the signature of the VC in index.json and checks the integrity of the file data through the digest field of the VC.
execute_trade
: Sends a transaction to execute the trade if the data-package verification is successful.
If verification is successful, executes the
execute_trade
transaction on the blockchain for trade settlement.Transaction parameters:
Purchase ID (contractId).
Owner's DID public key (ownerDIDPk).
List of data issuers (dataIssuers: issuer DID and validity period of each VC).
Proof for data integrity verification (SHA256 hash of the decryptedMessage).
store_data
: API to provide the data-package URL to the Buyer.
Uploads the verified data-package to S3 and provides the URL to the Buyer.
Grants authorization so that only the Buyer can call the API.
Index Server
The Index Server is an important component that manages the index of data owned by each PWN (Personal Web Node). Through this, participants in the data market can easily identify what data is held by whom, enabling efficient data transactions.
Architecture
The Index Server consists of the following components:
PWN (Personal Web Node): Stores and manages the index of data owned by each individual.
Index Server: Stores the index sent from the PWN in ElasticSearch and searches the index according to user requests.
ElasticSearch: Stores index data and supports fast searches.
PWN uses the unique Query Language of the Index Server to store its index or query the list of DIDs holding specific index values.
Index Server Query Language
It primarily uses GraphQL syntax, but there are differences in how each field is used compared to GraphQL.
Data With Value
is generally composed of key-value pairs. The value can be defined regardless of type, such as String
, Number
, or Object
.
Additionally, for the Number
type, it is possible to specify a range without specifying a specific value using gt
, gte
, lt
, lte
.
Data Without Value
is composed of key values. For fields that do not disclose the actual value, it is composed of the key values of those fields
.
Store Data
When storing data in the Index Server, the Index Server Query Language sent to the Index Server is converted and stored in a format that can be stored in ElasticSearch.
For example, if the request is made as shown in the picture above, it is stored in ElasticSearch as shown below.
The part corresponding to the Data Type in the Index Server Query Language corresponds to key
, and the part corresponding to Data With Value is included in the index
field. The part corresponding to Data Without Value is included as field names in the Array value within the hadData
field in the index
field.
Query Data
The Index Server can query DIDs that match the conditions of the Index Server Query. In this case, it converts and uses the Index Server Query Language into DSL (Domain Specific Language) format to be used in ElasticSearch.
The following code is an example implemented in TypeScript.
By obtaining the DID that matches the given conditions, the data buyer can request a data purchase, or the agency can notify the data owner to sell the data.
Data Market Smart Contract
Overview
Newnal Data Market is a blockchain-based data market platform that ensures decentralization and transaction transparency using smart contracts. Data buyers and sellers can conduct transactions in a P2P manner and also support transactions through agencies. In any case, a signed data transaction contract between the parties is required for the transaction to proceed, and the contract is managed through the blockchain.
Once an agreement on the data transaction contract between the parties is reached, the data is transferred from the seller to the buyer off-chain. There is a data verifier who checks the validity of the data, and the payment for the data transaction is made through a blockchain transaction. The blockchain clearly specifies the data market fee and the amount for each data transaction, and anyone can verify this information.
In the case of the agency model that sells data on behalf of the seller, the agency is managed through the blockchain. The role of the agency on the platform is to act as an intermediary by informing the buyer of the indexing information of the seller's data and receives rewards when the data transaction is completed. All information about the agency, including registration and cancellation, is managed through the blockchain.
Contracts
Newnal Data Market facilitates data transactions based on agreements between data buyers and sellers. Agreements are managed through on-chain contracts, primarily divided into Purchase and Delegate contracts.
Purchase
The purchase contract is established between the data buyer and seller (if there is no agency) or between the seller and agency (if there is an agency).
If there is no agency: When the data seller and buyer trade directly, the buyer first creates a purchase contract through a transaction, expressing their intent to purchase. The data owner (seller) then submits the data specified in the
data_purchase_info
in the contract.If there is an agency: Typically, the data owner (seller) trades through an agency instead of directly with the buyer. The data owner (seller) designates a trusted agency to which they can delegate their data through the DelegateContract. The data buyer then enters into a contract with the agency through the PurchaseContract regarding the data they wish to purchase.
In any case, a transaction signature between the trading parties is required for the contract to be valid. The contract is managed by being mapped to storage with ContractId
as the key.
Components
data_buyer
Data buyer account
data_verifier
Data verifier account designated by the data buyer
effective_at
The time when the contract becomes active
expired_at
The time when the contract expires
data_purchase_info
Data purchase range
system_token_id
Payment method for data transactions
agency
Agency account, must be specified for data transactions involving an agency
price_per_data
Price per data unit
deposit
Deposit by the data buyer
trade_count
Number of data transactions
data_trade_record
Records of data transactions to prevent duplicate transactions and save transaction accounts
signed_status
Signature status between the trading parties (seller and buyer or agency and buyer)
Delegate
A delegation contract is written when there is an agency involved in the data transaction, between the seller and the agency or the buyer and the agency.
Components
data_owner
Data seller or owner account
agency
Agency account to which the data owner delegates their data index information
data_owner_minimum_fee_ratio
Minimum profit distribution ratio for the data owner
effective_at
The time when the contract becomes active
expired_at
The time when the contract expires
signed_status
Signature status between the trading parties (owner and agency)
Execute
Once an agreement is reached on the Purchase or Delegate contract, the seller transfers the data to the buyer off-chain, and the data verifier specified by the buyer validates the data. The payment for the data transaction is made through a blockchain transaction according to the conditions specified in the contract, and all records can be verified through the chain. Most data validity checks (e.g., schema verification, data sales range) are performed by the Verifier, while the smart contract itself verifies the validity of the data transaction contract.
Validity
Verify the presence of the Verifier in the contract: Actual data transactions must be conducted through the Verifier.
Verify the validity period of the contract: The data transaction must occur after the effective_at and before the expired_at specified in the contract.
Verify agency match: In the case of an agency model, verify that the agency specified in the data contract matches.
Verify transaction count in the contract: Data transactions can only occur up to the maximum specified value.
Prevent duplicate transactions: In the case of a non-agency model, the data seller can only sell once.
Profit distribution ratio: The platform cannot set a higher ratio than the overall ratio determined by the data market platform policy.
Distribution
Once all validity checks pass, profit distribution is performed. In Newnal Data Market, profits from sold data are distributed to the seller, data provider, platform, and in the case of an agency model, to the agency as well. Each ratio is divided according to the ratio specified in the smart contract, and all records can be verified on the blockchain. In the case of data providers, profits are distributed based on their contribution to data supply.
Proof
The Verifier receives the data off-chain and includes proof (e.g., hash) of the data in the transaction. This proof is generated as a blockchain event, which does not occupy storage space but leaves a record.
Agency
Data buyers and sellers can conduct transactions through an agency instead of trading directly. Generally, data sellers find it difficult to continuously monitor data buyers, and the agency performs this role on their behalf. Agencies can operate after registering on-chain. Various information about the agency can also be verified through the blockchain. Additionally, if the agency no longer wishes to operate, they can cancel their registration through a transaction.
Configuration
Defines the profit distribution ratio settings for the data market platform.
Components
total_fee_ratio
Total profit distribution ratio
min_platform_fee_ratio
Minimum platform profit distribution ratio
Main processes
Agency Registration
The user executes a transaction including metadata (identity information, transaction fees, etc.) on the Datamarket Chain to perform the role of an Agency.
Data Purchase w/ Agency
(Buyer) Register data purchase
Send
register_data_purchase
transaction to Data Market ChainSelect desired agency from the agency list
(Agency) Sign purchase contract
If the Agency agrees to the data purchase delegation created in step 1, send a signature transaction
(Agency) Query DID list of data sellers matching purchase conditions
Periodically check with Index Service if there are data sellers matching the purchase conditions
(Agency) Send data sales request alerts to seller DID list
(Seller) Submit data to Verifier
Configure Data package matching the purchase conditions and submit data to Verifier
(Verifier) Verify Data package
Verify Data package
(Verifier) Execute trade
If verification is successful, send
execute_trade
transaction to Market Chain for transaction settlement
(Buyer) Query purchased data
Query data through the URL provided by the Verifier
Data Purchase w/o Agency
(Buyer) Register data purchase
Send
register_data_purchase
transaction to Data Market Chain
(Seller) Submit data to Verifier
Configure Data package matching the purchase conditions and submit data to Verifier
(Verifier) Verify Data package
Verify Data package
(Verifier) Execute trade
If verification is successful, send
execute_trade
transaction to Market Chain for transaction settlement
(Buyer) Query purchased data
Query data through the URL provided by the Verifier
Data Sale w/ Agency
(Seller) Sign delegate contract
If the data seller wants to sell their data automatically, select an Agency on the Chain and send
sign_delegate_contract
transaction
(Agency) Make delegate contract
If the Agency agrees to the data sales delegation from the seller in step 1, send a
make_delegate_contract
transaction to sign
(Seller) Store data received from Data Provider in PWN
(Seller) Store the index of the data in the Index Server
(Agency) Periodically query data purchase cases matching the contracted seller's data
When data available for sale is found, send data sale request alert to seller DID
Last updated