Fast Bitcoin Data Mining
Tutorial
(used in Cryptanalysis COMPGA18 and Applied Cryptography GA12, part of UCL M.Sc. Information Security)
Introduction
This document aims to help anyone mine the Bitcoin blockchain for
interesting events at state-of-the-art speed. We assume very little prior
knowledge of programming. The tutorial is divided into three parts:
Part 1 demonstrates how to set up and run the BitcoinDatabaseGenerator
project, which parses the blockchain files and creates an SQL database.
Part 2 shows the modifications necessary to add additional fields to the
database.
Part 3 is a tutorial on how to extract random numbers used more than once
in elliptic curve digital signatures (ECDSA) in the Bitcoin blockchain, using
the previously created database.
Part 1
The following components are necessary; we provide step-by-step instructions.
1. Bitcoin Core (free); also tested to work with Bitcoin XT
2. Microsoft SQL Server Express 2008/2012/2014 (free, but with some issues;
we strongly recommend a full version, see the Limitations section).
3. A free tool called BitcoinDatabaseGenerator compiled with Visual Studio
2013. We will later add functionality to this tool.
4. Visual Studio 2008 or higher to access the data at very high speed in
C/C++.
[optional]
Emergency solution for lack of space: it is also possible, AT RUNTIME or later,
to stop the Windows SQL service, rename DATA to DATA2 (for now), re-create the
DATA directory and map this empty NTFS directory with Disk Management to a
new partition with lots of space, then move the contents of DATA2 there and
restart the Windows service (“Start”).
One method is to use Visual Studio. In Visual Studio 2010 [or another version]:
• Go to View => Other Windows => Server Explorer (Ctrl+Alt+S) and place the
pane somewhere convenient.
• Under Data Connections, right-click and choose "Create New SQL Server
Database".
• In Server Name, enter PC-NAME\SQLEXPRESSBIG. Your PC-NAME can be found
under My Computer -> right click -> Properties.
• Under New database name, enter MyBitcoinData and click OK.
Remark:
MOVING FILES: these database files CAN be moved (preferably together) to
another computer. Stop the SQL service for SQLEXPRESSBIG under My
Computer => Management => Services, copy the files, then start the service again.
Warning: erasing the database called “master” will prevent SQLEXPRESS from
starting again; you would then need to uninstall and re-install it carefully
(a slow process).
1.5 Step 5: Run BitcoinDatabaseGenerator.exe
In Windows search, type “cmd”,
run cmd.exe, then type:
D:
CD d:\work
BitcoinDatabaseGenerator.exe /BlockchainPath X:\active_live_blockchain\satoshi110\blocks /SqlServerName PC-NAME\SQLEXPRESSBIG /SqlDbName MyBitcoinData
Once this process is complete, you can browse the data inside the database with
tools such as SQL Server Management Studio, which comes free with SQL Server.
Part 3
This part is a step-by-step tutorial on how to extract random numbers used more
than once in elliptic curve digital signatures (ECDSA) in the Bitcoin blockchain.
3.1 Introduction
Randomly choosing the same 32-byte number more than once should happen
only with negligible probability, but for various reasons this phenomenon can
be observed in the public Bitcoin ledger. It enables several types of attack,
described by Courtois et al. [1], in which some or all users can steal the
contents of a wallet that used one of those “bad randoms”. This makes finding
those numbers, and studying how they arise, a worthwhile goal.
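To see why a repeated random number is so damaging, recall that an ECDSA signature (r, s) on a message hash z, made with private key d and random number k, satisfies s = k^-1 (z + r d) mod n, where r depends only on k. Two signatures from the same key that share the same r therefore give two linear equations in the two unknowns k and d. The Python sketch below illustrates only this textbook algebra; it is not taken from [1] or from the scripts used later in this tutorial. The function names are hypothetical, the toy values are made up, and r is an arbitrary stand-in rather than a real curve point coordinate, since the algebra does not depend on how r was derived. It also assumes neither s value was negated after signing.

```python
# Order of the secp256k1 group used by Bitcoin (a prime number).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def inv(x):
    """Modular inverse mod N via Fermat's little theorem (valid as N is prime)."""
    return pow(x % N, N - 2, N)


def toy_sign(d, k, r, z):
    """Compute the s half of a signature: s = k^-1 * (z + r*d) mod N.

    Illustration only: r is passed in instead of being derived from k*G.
    """
    return (inv(k) * (z + r * d)) % N


def recover_from_reused_nonce(r, s1, z1, s2, z2):
    """Return (k, d) from two signatures (r, s1) and (r, s2) on hashes z1, z2
    that were made with the same private key d and the same random number k."""
    if (s1 - s2) % N == 0:
        raise ValueError("signatures are identical; nothing to solve")
    # s1 - s2 = k^-1 * (z1 - z2)  =>  k = (z1 - z2) / (s1 - s2)
    k = ((z1 - z2) * inv(s1 - s2)) % N
    # s1 * k = z1 + r*d           =>  d = (s1*k - z1) / r
    d = ((s1 * k - z1) * inv(r)) % N
    return k, d


if __name__ == "__main__":
    # Made-up toy values, not real blockchain data.
    d, k, r = 12345, 67890, 0x1234ABCD
    z1, z2 = 111, 222
    s1, s2 = toy_sign(d, k, r, z1), toy_sign(d, k, r, z2)
    print(recover_from_reused_nonce(r, s1, z1, s2, z2) == (k, d))
```

The same two formulas apply to real signatures once r, s and the signed hash z have been extracted from the two transactions.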
3.2 Prerequisites
In order to recover all reused random numbers, the complete Bitcoin blockchain
is needed in a database. To obtain it, follow the instructions in Parts 1 and 2.
Once you have completed those steps, proceed to the next section.
[1] Courtois, N.T., Valsorda, F. and Emirdag, P., 2014. Private Key Recovery Combination Attacks: On Extreme Fragility of
Popular Bitcoin Key Management, Wallet and Cold Storage Solutions in Presence of Poor RNG Events.
In order to create the two tables mentioned, you will need Microsoft SQL Server
Management Studio, which should have been installed together with SQL Server.
If it was not, download and install the version that corresponds to your
database; this software is free.
The next step is to download the RandomsToTables.sql script from
https://github.com/JasonPap/Reused-Bitcoin-Numbers/blob/master/SQL/RandomsToTables.sql
Open it with SQL Server Management Studio and execute it (F5). This will take
up to 3 hours, depending on the hardware and the load of the machine. Once the
query is completed, you can look at the two new tables.
3.4 Generating webpage with reused random numbers
The final step of this tutorial is to generate a webpage that lists the
random numbers used more than once together with the transactions in which
they appeared.
1. Download and install Python 2.7.
2. Install pymssql, a Python module that makes connecting to SQL Server
very easy.
a. Download the wheel file from the page below and place it in C:/Python27:
https://www.lfd.uci.edu/~gohlke/pythonlibs/#pymssql
b. Install it with pip from the command line: cd into C:/Python27 and run
pip install pymssql-2.1.1-cp27-none-win_amd64.whl
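Once pymssql is installed, a script can read from the database through the standard Python DB-API. The following is a minimal sketch of how that could look, not the tutorial's own script. The function names are hypothetical, the login values are placeholders, and the query in the usage comment refers to a made-up table name. pymssql is imported inside the function so that the query helper can be tried with any DB-API connection.

```python
def connect_to_bitcoin_db(server, database, user, password):
    """Open a connection to SQL Server with pymssql.

    The import is local so the rest of this sketch does not require pymssql.
    """
    import pymssql
    return pymssql.connect(server=server, user=user,
                           password=password, database=database)


def fetch_all(conn, query):
    """Run a query on any DB-API connection and return all rows as a list."""
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        cursor.close()


# Example usage (placeholders: login values and the table name are assumptions):
#   conn = connect_to_bitcoin_db(r"PC-NAME\SQLEXPRESSBIG", "MyBitcoinData",
#                                "my_user", "my_password")
#   rows = fetch_all(conn, "SELECT * FROM SomeTable")
#   conn.close()
```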
After those steps, a file named index.html should appear in the same directory as
the Python script. Open it with any web browser. The final result, based on data
collected on 07/03/2016, can be downloaded from:
https://github.com/JasonPap/Reused-Bitcoin-Numbers/blob/master/Python/index.html
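Conceptually, generating such a page amounts to grouping signatures by their random-number value and writing out every group that occurs more than once. The sketch below shows that idea under stated assumptions: rows are taken to be (r, transaction hash) pairs of hexadecimal strings, and both this row layout and the function names are hypothetical. It does not reproduce the actual script in the repository.

```python
from collections import defaultdict


def group_reused(rows):
    """Map each r value to the sorted list of transactions it appears in,
    keeping only r values that occur more than once."""
    by_r = defaultdict(list)
    for r, txid in rows:
        by_r[r].append(txid)
    return dict((r, sorted(txids)) for r, txids in by_r.items()
                if len(txids) > 1)


def render_html(reused):
    """Build a simple HTML page: one heading per reused r value, followed by
    a list of the transactions in which it appears."""
    parts = ["<html><body>", "<h1>Reused random numbers</h1>"]
    for r in sorted(reused):
        parts.append("<h2>{0}</h2>".format(r))
        parts.append("<ul>")
        for txid in reused[r]:
            parts.append("<li>{0}</li>".format(txid))
        parts.append("</ul>")
    parts.append("</body></html>")
    return "\n".join(parts)


# Example usage with made-up rows:
#   page = render_html(group_reused([("aa", "tx1"), ("aa", "tx3")]))
#   with open("index.html", "w") as f:
#       f.write(page)
```

Occurrences are counted rather than distinct transactions, so an r value repeated within a single transaction is reported as well.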