Add Merkletools package for bigfile plugin

shortcutme 2017-10-04 13:29:32 +02:00
parent cf1154f2c5
commit 8e2be5cfe2
4 changed files with 366 additions and 0 deletions

View file

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2016 Tierion
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View file

@@ -0,0 +1,178 @@
# pymerkletools
[![PyPI version](https://badge.fury.io/py/merkletools.svg)](https://badge.fury.io/py/merkletools) [![Build Status](https://travis-ci.org/Tierion/pymerkletools.svg?branch=master)](https://travis-ci.org/Tierion/pymerkletools)
This is a Python port of [merkle-tools](https://github.com/tierion/merkle-tools).
Tools for creating Merkle trees, generating merkle proofs, and verification of merkle proofs.
## Installation
```
pip install merkletools
```
### Create MerkleTools Object
```python
from merkletools import MerkleTools
mt = MerkleTools(hash_type="md5")  # default is sha256
# hash_type is case-insensitive; supported values are
# 'md5', 'sha224', 'sha256', 'sha384', 'sha512'
# as well as the SHA3 family of algorithms:
# 'sha3_224', 'sha3_256', 'sha3_384', and 'sha3_512'
```
SHA3 support depends on [pysha3](https://pypi.python.org/pypi/pysha3). It is installed together with this module, or you can install it manually with:
```bash
pip install pysha3==1.0b1
```
## Methods
### add_leaf(value, do_hash)
Adds a value, or a list of values, to the tree as leaves. Each value must be a hex string; otherwise set the optional `do_hash` to `True` to have the value hashed before it is added to the tree.
```python
hex_data = '05ae04314577b2783b4be98211d1b72476c59e9c413cfb2afa2f0c68e0d93911'
list_data = ['Some text data', 'perhaps']
mt.add_leaf(hex_data)
mt.add_leaf(list_data, True)
```
### get_leaf_count()
Returns the number of leaves that are currently added to the tree.
```python
leaf_count = mt.get_leaf_count()
```
### get_leaf(index)
Returns the value of the leaf at the given index as a hex string.
```python
leaf_value = mt.get_leaf(1)
```
### reset_tree()
Removes all the leaves from the tree, preparing it for the creation of a new tree.
```python
mt.reset_tree()
```
### make_tree()
Generates the merkle tree using the leaves that have been added.
```python
mt.make_tree()
```
### is_ready
`.is_ready` is a boolean property indicating whether the tree is built and ready to supply its root and proofs. The `is_ready` state is `True` only after calling `make_tree()`. Adding leaves or resetting the tree changes the ready state to `False`.
```python
is_ready = mt.is_ready
```
### get_merkle_root()
Returns the merkle root of the tree as a hex string. If the tree is not ready, `None` is returned.
```python
root_value = mt.get_merkle_root()
```
### get_proof(index)
Returns the proof as a list of hash objects for the leaf at the given index. If the tree is not ready or no leaf exists at the given index, `None` is returned.
```python
proof = mt.get_proof(1)
```
The proof array contains a set of merkle sibling objects. Each object contains the sibling hash, keyed by either `right` or `left`. The key tells you where that sibling sits in relation to the current hash being evaluated. This information is needed for proof validation, as explained in the following section.
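As a rough illustration of how such sibling objects can be derived, the sketch below walks a list of tree levels from the leaves up to the root. It is written for Python 3 and uses only `hashlib`. The helper names `sha256_hex` and `proof_from_levels`, and the leaves-first ordering of `levels`, are assumptions made for this example and are not part of the `MerkleTools` API.

```python
import hashlib


def sha256_hex(hex_left, hex_right):
    """Hash the concatenation of two hex-encoded digests (hypothetical helper)."""
    data = bytes.fromhex(hex_left) + bytes.fromhex(hex_right)
    return hashlib.sha256(data).hexdigest()


def proof_from_levels(levels, index):
    """Collect sibling hashes for the leaf at `index`.

    `levels` is assumed to be ordered leaves first, root last, with a lone
    last node promoted unchanged to the next level.
    """
    proof = []
    for level in levels[:-1]:  # the root level has no siblings
        is_lonely = index == len(level) - 1 and len(level) % 2 == 1
        if not is_lonely:
            if index % 2:  # current node is a right child, sibling is on the left
                proof.append({'left': level[index - 1]})
            else:          # current node is a left child, sibling is on the right
                proof.append({'right': level[index + 1]})
        index //= 2
    return proof


# Three leaves: a and b are paired, c is promoted to the next level.
a, b, c = (hashlib.sha256(s).hexdigest() for s in (b'a', b'b', b'c'))
ab = sha256_hex(a, b)
root = sha256_hex(ab, c)
levels = [[a, b, c], [ab, c], [root]]

proof_b = proof_from_levels(levels, 1)  # sibling a on the left, then c on the right
proof_c = proof_from_levels(levels, 2)  # promoted once, then sibling ab on the left
```

A promoted node contributes no entry for the level where it had no sibling, which is why the proof for `c` is shorter than the proof for `b`.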
### validate_proof(proof, target_hash, merkle_root)
Returns a boolean indicating whether or not the proof is valid and correctly connects the `target_hash` to the `merkle_root`. `proof` is a proof array as supplied by the `get_proof` method. The `target_hash` and `merkle_root` parameters must be hex strings.
```python
proof = [
    {'right': '09096dbc49b7909917e13b795ebf289ace50b870440f10424af8845fb7761ea5'},
    {'right': 'ed2456914e48c1e17b7bd922177291ef8b7f553edf1b1f66b6fc1a076524b22f'},
    {'left': 'eac53dde9661daf47a428efea28c81a021c06d64f98eeabbdcff442d992153a8'},
]
target_hash = '36e0fd847d927d68475f32a94efff30812ee3ce87c7752973f4dd7476aa2e97e'
merkle_root = 'b8b1f39aa2e3fc2dde37f3df04e829f514fb98369b522bfb35c663befa896766'
is_valid = mt.validate_proof(proof, target_hash, merkle_root)
```
The proof process uses all the proof objects in the array to attempt to prove a relationship between the `target_hash` and the `merkle_root` values. The steps to validate a proof are:
1. Concatenate `target_hash` and the first hash in the proof array. The right or left designation specifies which side of the concatenation the proof hash value goes on.
2. Hash the resulting value.
3. Concatenate the resulting hash with the next hash in the proof array, using the same left and right rules.
4. Hash that value and continue the process until you have gone through each item in the proof array.
5. The final hash value should equal the `merkle_root` value if the proof is valid, otherwise the proof is invalid.
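The steps above can be sketched in a few lines of Python 3. This is a minimal illustration that assumes SHA-256 and hex-string inputs; the function name `validate_proof_sketch` is hypothetical and the code is not the module's own implementation.

```python
import hashlib


def validate_proof_sketch(proof, target_hash, merkle_root):
    """Return True if `proof` links `target_hash` to `merkle_root` (hex strings)."""
    current = bytes.fromhex(target_hash)
    for step in proof:
        if 'left' in step:   # sibling goes on the left of the concatenation
            current = hashlib.sha256(bytes.fromhex(step['left']) + current).digest()
        else:                # sibling goes on the right of the concatenation
            current = hashlib.sha256(current + bytes.fromhex(step['right'])).digest()
    return current == bytes.fromhex(merkle_root)


# Tiny two-leaf tree built by hand to exercise the function.
leaf_a = hashlib.sha256(b'a').hexdigest()
leaf_b = hashlib.sha256(b'b').hexdigest()
root = hashlib.sha256(bytes.fromhex(leaf_a) + bytes.fromhex(leaf_b)).hexdigest()

ok = validate_proof_sketch([{'right': leaf_b}], leaf_a, root)
also_ok = validate_proof_sketch([{'left': leaf_a}], leaf_b, root)
bad = validate_proof_sketch([{'left': leaf_b}], leaf_a, root)  # wrong side
```

Putting the sibling on the wrong side changes the concatenation order and therefore the hash, so the last call fails even though the sibling value itself is correct.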
## Common Usage
### Creating a tree and generating the proofs
```python
from merkletools import MerkleTools

mt = MerkleTools()
mt.add_leaf("tierion", True)
mt.add_leaf(["bitcoin", "blockchain"], True)
mt.make_tree()
print("root: " + mt.get_merkle_root())  # root: 765f15d171871b00034ee55e48ffdf76afbc44ed0bcff5c82f31351d333c2ed1
print(mt.get_proof(1))  # [{'left': '2da7240f6c88536be72abe9f04e454c6478ee29709fc3729ddfb942f804fbf08'},
                        #  {'right': 'ef7797e13d3a75526946a3bcf00daec9fc9c9c4d51ddc7cc5df888f74dd434d1'}]
print(mt.validate_proof(mt.get_proof(1), mt.get_leaf(1), mt.get_merkle_root()))  # True
```
## Notes
### About tree generation
1. Internally, leaves are stored as `bytearray`. When the tree is built, it is generated by hashing together the `bytearray` values.
2. Lonely leaf nodes are promoted to the next level up, as depicted below.

                         ROOT=Hash(H+E)
                         /        \
                        /          \
                 H=Hash(F+G)        E
                 /       \           \
                /         \           \
         F=Hash(A+B)    G=Hash(C+D)    E
         /     \        /     \         \
        /       \      /       \         \
       A         B    C         D         E
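
The promotion rule in the diagram can be illustrated with a short, self-contained Python 3 sketch. It assumes SHA-256 and raw `bytes` digests; `build_levels` and `h` are hypothetical names chosen for this example, and the levels are stored leaves first, which may differ from the module's internal ordering.

```python
import hashlib


def h(data):
    """SHA-256 digest of `data` as raw bytes (hypothetical helper)."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, leaves first and root last.

    A lone last node is promoted unchanged to the next level instead of
    being hashed with a copy of itself.
    """
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        paired = len(current) - len(current) % 2  # nodes that have a partner
        next_level = [h(current[i] + current[i + 1]) for i in range(0, paired, 2)]
        if paired != len(current):
            next_level.append(current[-1])  # promote the lonely node
        levels.append(next_level)
    return levels


# Five leaves, matching the diagram: E is promoted twice before reaching the root.
A, B, C, D, E = (h(x) for x in (b'A', b'B', b'C', b'D', b'E'))
levels = build_levels([A, B, C, D, E])
root = levels[-1][0]
```

With five leaves the level sizes shrink as 5, 3, 2, 1, and the root equals `Hash(Hash(Hash(A+B) + Hash(C+D)) + E)`.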
### Development
This module uses Python's `hashlib` for hashing. Inside a `MerkleTools` object all
hashes are stored as Python `bytearray`. This way hashes can be concatenated simply with `+` and the result
used as input for the hash function. For simplicity and ease of use, however,
`MerkleTools` methods expect both inputs and outputs to be hex
strings. We can convert from one type to the other using default Python string methods.
For example:
```python
import hashlib
# Python 2: the 'hex' codec converts between raw digests and hex strings
hash = hashlib.sha256('a').digest()  # '\xca\x97\x81\x12\xca\x1b\xbd\xca\xfa\xc21\xb3\x9a#\xdcM\xa7\x86\xef\xf8\x14|Nr\xb9\x80w\x85\xaf\xeeH\xbb'
hex_string = hash.encode('hex')  # 'ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb'
back_to_hash = hex_string.decode('hex')  # '\xca\x97\x81\x12\xca\x1b\xbd\xca\xfa\xc21\xb3\x9a#\xdcM\xa7\x86\xef\xf8\x14|Nr\xb9\x80w\x85\xaf\xeeH\xbb'
```
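The `'hex'` string codec used above is only available on Python 2. On Python 3 the same round trip can be sketched with `bytes.hex()` and `bytes.fromhex()`, or with `binascii.hexlify`, which exists on both versions. The variable names below are chosen for this example only.

```python
import binascii
import hashlib

digest = hashlib.sha256(b'a').digest()       # raw 32-byte digest
hex_string = digest.hex()                    # bytes -> hex string
back_to_digest = bytes.fromhex(hex_string)   # hex string -> bytes

# binascii.hexlify returns bytes on Python 3, so decode it to get a str
same_hex = binascii.hexlify(digest).decode('ascii')

print(hex_string)  # ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb
```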

View file

@@ -0,0 +1,138 @@
import hashlib
import binascii


class MerkleTools(object):
    def __init__(self, hash_type="sha256"):
        hash_type = hash_type.lower()
        if hash_type == 'sha256':
            self.hash_function = hashlib.sha256
        elif hash_type == 'md5':
            self.hash_function = hashlib.md5
        elif hash_type == 'sha224':
            self.hash_function = hashlib.sha224
        elif hash_type == 'sha384':
            self.hash_function = hashlib.sha384
        elif hash_type == 'sha512':
            self.hash_function = hashlib.sha512
        elif hash_type == 'sha3_256':
            self.hash_function = hashlib.sha3_256
        elif hash_type == 'sha3_224':
            self.hash_function = hashlib.sha3_224
        elif hash_type == 'sha3_384':
            self.hash_function = hashlib.sha3_384
        elif hash_type == 'sha3_512':
            self.hash_function = hashlib.sha3_512
        else:
            raise Exception('`hash_type` {} not supported'.format(hash_type))

        self.reset_tree()

    def _to_hex(self, x):
        try:  # python3
            return x.hex()
        except:  # python2
            return binascii.hexlify(x)

    def reset_tree(self):
        self.leaves = list()
        self.levels = None
        self.is_ready = False

    def add_leaf(self, values, do_hash=False):
        self.is_ready = False
        # check if single leaf
        if isinstance(values, tuple) or isinstance(values, list):
            for v in values:
                if do_hash:
                    v = v.encode('utf-8')
                    v = self.hash_function(v).hexdigest()
                    v = bytearray.fromhex(v)
                else:
                    v = bytearray.fromhex(v)
                self.leaves.append(v)
        else:
            if do_hash:
                v = values.encode("utf-8")
                v = self.hash_function(v).hexdigest()
                v = bytearray.fromhex(v)
            else:
                v = bytearray.fromhex(values)
            self.leaves.append(v)

    def get_leaf(self, index):
        return self._to_hex(self.leaves[index])

    def get_leaf_count(self):
        return len(self.leaves)

    def get_tree_ready_state(self):
        return self.is_ready

    def _calculate_next_level(self):
        solo_leave = None
        N = len(self.levels[0])  # number of leaves on the level
        if N % 2 == 1:  # if odd number of leaves on the level
            solo_leave = self.levels[0][-1]
            N -= 1

        new_level = []
        for l, r in zip(self.levels[0][0:N:2], self.levels[0][1:N:2]):
            new_level.append(self.hash_function(l+r).digest())
        if solo_leave is not None:
            new_level.append(solo_leave)
        self.levels = [new_level, ] + self.levels  # prepend new level

    def make_tree(self):
        self.is_ready = False
        if self.get_leaf_count() > 0:
            self.levels = [self.leaves, ]
            while len(self.levels[0]) > 1:
                self._calculate_next_level()
        self.is_ready = True

    def get_merkle_root(self):
        if self.is_ready:
            if self.levels is not None:
                return self._to_hex(self.levels[0][0])
            else:
                return None
        else:
            return None

    def get_proof(self, index):
        if self.levels is None:
            return None
        elif not self.is_ready or index > len(self.leaves)-1 or index < 0:
            return None
        else:
            proof = []
            for x in range(len(self.levels) - 1, 0, -1):
                level_len = len(self.levels[x])
                if (index == level_len - 1) and (level_len % 2 == 1):  # skip if this is an odd end node
                    index = int(index / 2.)
                    continue
                is_right_node = index % 2
                sibling_index = index - 1 if is_right_node else index + 1
                sibling_pos = "left" if is_right_node else "right"
                sibling_value = self._to_hex(self.levels[x][sibling_index])
                proof.append({sibling_pos: sibling_value})
                index = int(index / 2.)
            return proof

    def validate_proof(self, proof, target_hash, merkle_root):
        merkle_root = bytearray.fromhex(merkle_root)
        target_hash = bytearray.fromhex(target_hash)
        if len(proof) == 0:
            return target_hash == merkle_root
        else:
            proof_hash = target_hash
            for p in proof:
                try:
                    # the sibling is a left node
                    sibling = bytearray.fromhex(p['left'])
                    proof_hash = self.hash_function(sibling + proof_hash).digest()
                except:
                    # the sibling is a right node
                    sibling = bytearray.fromhex(p['right'])
                    proof_hash = self.hash_function(proof_hash + sibling).digest()
            return proof_hash == merkle_root

View file

@@ -0,0 +1,29 @@
import os

from setuptools import find_packages
from setuptools import setup

here = os.path.abspath(os.path.dirname(__file__))
install_requires = [
    "pysha3==1.0b1"
]

setup(
    name='merkletools',
    version='1.0.2',
    description='Merkle Tools',
    classifiers=[
        "Intended Audience :: Developers",
        "Intended Audience :: Science/Research",
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 2.7",
    ],
    url='https://github.com/',
    author='Eder Santana',
    keywords='merkle tree, blockchain, tierion',
    license="MIT",
    packages=find_packages(),
    include_package_data=False,
    zip_safe=False,
    install_requires=install_requires
)