Anonymous Bitcoin Transactions: Felix Maduakor
Felix Maduakor
Statutory Declaration
I declare that I have not previously submitted any work in the same or a similar
version for another examination at Ruhr-Universität Bochum or any other
university.
I affirm that I wrote this thesis independently and used no sources other than those
indicated. Passages taken verbatim or in substance from other sources have been
marked as such with a reference to the source. This applies analogously to
drawings, sketches, pictorial representations and the like.
I also affirm that the written version I submitted matches the digital version.
I consent to the digital version of this thesis being used for plagiarism
checking.
Date author
Contents
Glossary
Acronyms
1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Organization of this Thesis
2 Background
2.1 Bitcoin
2.1.1 Blockchain
2.1.2 Transactions
2.1.2.1 P2PKH and P2SH transactions
2.1.2.2 Multisignature transactions
2.1.2.3 Replace-By-Fee
2.1.2.4 Locktime, sequence numbers and version
2.1.2.5 Transaction fee
2.1.2.6 Transaction time and IP addresses
2.1.2.7 Example transaction
2.2 Fungibility
2.3 Privacy in Bitcoin
2.4 Mixing techniques
2.4.1 Decentralized mixing (P2P mixing)
2.4.2 Centralized Mixing Services (CMS)
2.4.3 Off chain mixing
4 Attack on coinmixer.se
4.1 Functionality of coinmixer.se
4.1.1 Optional setting: Multiple addresses
4.1.2 Optional setting: Time delay
4.1.3 Mixing fee
4.2 Attacker Model
4.3 Attacking Method
4.3.1 Steps to break coinmixer.se
4.4 Identifying coinmixer.se’s network
4.4.1 Characteristics of customer’s input transactions
4.4.2 Characteristics of coinmixer’s output transactions
4.4.3 Identifying customer’s and coinmixer’s transactions
4.5 Crawler
4.5.1 Gathering blockchain data
4.5.2 Data structure
4.5.3 Forward crawling
4.5.4 Backward crawling
4.5.5 Incorrect transaction distinguishing
4.6 Deanonymization
4.7 Results
5 Conclusion
5.1 Related Work
5.2 Future Work
List of Figures
List of Tables
List of Listings
Bibliography
A Database structure
B Python Code
Glossary
Block Generation Time Average time required till a new block is found.
Blockchain Difficulty The required time to mine a Bitcoin block is based on a dy-
namically calculated blockchain difficulty.
Letter Of Guarantee A letter signed by the coinmixer, which provides the input
and output addresses of the mixing process.
Wallet/Client Software Software that can be used to create, receive and send
Bitcoin transactions.
Acronyms
RBF Replace-By-Fee.
RPC Remote Procedure Call.
Sat Satoshi.
Seq Sequence Number.
TX Transaction.
At the time of writing, Bitcoin is the leading P2P cryptocurrency [4]. It is based
on a decentralized P2P structure and was introduced by Satoshi Nakamoto in
2008. It was built to be the digital form of cash. Bitcoin makes it possible to
transfer assets without an intermediary [37].
However, through the implementation of the Bitcoin blockchain it is possible to trace
every transaction. Bitcoin does not offer strong privacy guarantees [42]. If Bitcoin
gets widely adopted as a payment method, there may be the need to enhance the
privacy of the network.
Recently, multiple P2P algorithms which aim to enhance privacy in the Bitcoin
network have been published [44, 40, 42, 30]. Tim Ruffing, Pedro Moreno-Sanchez
and Aniket Kate introduced CoinShuffle++ as a decentralized Bitcoin mixing
protocol which is fully compatible with the current Bitcoin system [40]. However,
none of the mentioned privacy-enhancing algorithms seems to be widely adopted.
Bitcoin Core, the reference client of Bitcoin, does not implement any mixing
protocol [3].
Another way to enhance the privacy of Bitcoin transactions is based on commercially
driven centralized mixing services. In this thesis we focus on centralized mixing
services. We discuss their advantages and disadvantages and present possible
attacking approaches. Furthermore, we implement an attack on coinmixer.se, a
frequently used centralized mixing service. Our aim is to implement an attack which
is able to deanonymize transactions that have previously been anonymized by
coinmixer.se. We also discuss a general attacking approach which could be able to
successfully attack most of the commonly used centralized Bitcoin mixing services.
1.1 Motivation
Privacy is an important aspect of cryptocurrencies. Several cryptocurrencies which
aim to enhance the privacy of transactions have been created, such as Monero or
Zerocash. Based on a zero-knowledge proof, the cryptocurrency Zerocash is able to
offer strong privacy guarantees [42].
While implementations of P2P mixing algorithms in Bitcoin clients could enhance
privacy in Bitcoin, they have not been widely adopted yet. Centralized Bitcoin
mixing services are often recommended as a way to enhance privacy [9].
However, we want to show that there may be a general attacking approach which
could lead to the deanonymization of most transactions that were processed by
centralized mixing services.
We are going to show that, even though the internals of a mixing service could
be based on a secure mixing algorithm, the implementation of this algorithm as a
centralized service could easily lead to vulnerabilities. Even a small information
leak could lead to the deanonymization of every transaction which the service ever
processed. Since all Bitcoin transactions are stored in the blockchain, an
identified information leak could deanonymize transactions which were processed
years ago.
1.2 Contribution
By attacking coinmixer.se as a centralized mixing service, we were able to show
the general problems of centralized mixing services. We implemented a tool which
is able to deanonymize transactions that have previously been anonymized by
coinmixer.se. While our implementation targets coinmixer.se, it can easily be
adapted to work with other centralized mixing services. We were able to create
and implement an attacking approach which is able to break nearly all known
centralized mixing services. Furthermore, it may even be applied to cryptocurrency
networks.
In this chapter we will give some technical background information about the basic
concepts behind Bitcoin. We will discuss the importance of privacy in Bitcoin and
different approaches which have been developed to enhance privacy in the Bitcoin
ecosystem.
2.1 Bitcoin
Bitcoin is peer-to-peer electronic cash through which digital transactions can be sent
without the need for an intermediary. While cash payments are typically made
directly between individuals, there was no digital way to make this kind of
transaction until Bitcoin solved this problem [37].
The basic design of the Bitcoin network was created by a developer using the
pseudonym Satoshi Nakamoto. He published the basic concepts in his whitepaper
Bitcoin: A Peer-to-Peer Electronic Cash System in 2008 [37].
No specific third party of the Bitcoin network needs to be trusted. However, large
parts of the Bitcoin network and the client software have to be trusted, otherwise
some specific attacking approaches may be possible [11]. This property has been
achieved through the sophisticated use of cryptographic models.
The Bitcoin protocol is entirely open source. There are several groups of developers
around Bitcoin. Unlike government-issued currencies, changes in the Bitcoin
protocol require the consensus of broad parts of the Bitcoin ecosystem. Realizing
these changes is time-consuming, since multiple parties have to be actively involved
in a successful protocol update. A dynamic monetary policy, as it is applied to
government-issued currencies, cannot be applied to Bitcoin [13]. As the first
cryptocurrency, Bitcoin created a new type of digital asset. In recent years,
hundreds of cryptocurrencies which differ from Bitcoin in specific properties have
been developed [4].
2.1.1 Blockchain
The Bitcoin blockchain is an implementation of a public ledger. Every transaction
that has ever been confirmed by the network is stored in this blockchain.
Through cryptographic primitives the blockchain is protected against modifications.
All transactions are publicly available, because they are permanently stored in the
blockchain.
The Bitcoin blockchain consists of blocks that are cryptographically connected to
each other. The header of the previous block is hashed and the hash is stored in
the next block header. A block consists of multiple transactions which are hashed
into a Merkle tree [9].
Bitcoin uses the SHA-256 hash function to cryptographically connect blocks in the
blockchain. Any change to the blockchain changes at least one hash value, which
corrupts this version of the blockchain. A corrupted version of the blockchain will
be rejected by the network. It should be cryptographically hard to create two
versions of the same block with different data but the same hash. While this strict
assumption holds in practice, it brings some security issues.
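The chaining of block headers described above can be illustrated with a minimal Python sketch. Bitcoin hashes block headers with two rounds of SHA-256; everything else here is a simplifying assumption for illustration: the toy header layout (previous header hash followed by a payload hash that stands in for the Merkle root) and the function names are hypothetical and do not reproduce Bitcoin's actual 80-byte header format.

```python
import hashlib

GENESIS_PREV = b"\x00" * 32  # placeholder "previous hash" for the first block


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block header hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def make_header(prev_hash: bytes, payload: bytes) -> bytes:
    """Toy header (assumption): previous header hash + hash of the block payload."""
    return prev_hash + double_sha256(payload)


def build_chain(payloads):
    """Link each toy header to the hash of its predecessor."""
    headers, prev = [], GENESIS_PREV
    for payload in payloads:
        header = make_header(prev, payload)
        headers.append(header)
        prev = double_sha256(header)
    return headers


def verify_chain(headers, payloads):
    """Recompute every link; any modified payload breaks the chain."""
    if len(headers) != len(payloads):
        return False
    prev = GENESIS_PREV
    for header, payload in zip(headers, payloads):
        if header != make_header(prev, payload):
            return False
        prev = double_sha256(header)
    return True
```

Changing a single payload changes its hash and therefore invalidates the stored headers, which is the property the network relies on to reject modified versions of the blockchain.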
In particular, the transaction malleability bug made it possible to generate multiple
valid transactions with different transaction hashes, since changing even a single bit
of the hash function's input results in a new SHA-256 hash. It is likely that this bug
led to the loss of more than 302,000 Bitcoins [19].
The transaction malleability bug has recently been fixed through the implementation
of SegWit. However, this bug can still be exploited in some specific circumstances
[16].
The smallest unit of Bitcoin is named satoshi. One satoshi equals 0.00000001 BTC
(8 decimals).
2.1.2 Transactions
There are two different types of Bitcoin transactions: the so-called coinbase
transaction, which pays the miner who produced a new Bitcoin block, and typical
Bitcoin transactions that are sent from a Bitcoin address. In the following
sections we are not going to discuss coinbase transactions, since they can only be
generated by miners. Some of the specific properties we are going to describe may
not apply to coinbase transactions.
As we have already seen in the previous section, transactions are part of a Bitcoin
block. Every transaction consists of inputs and outputs. For every input of a
transaction, there has to be a previous output. An output can only be spent once.
Outputs that have not yet been used as an input are called unspent transaction
outputs (UTXO). In general terms, a transaction of Bitcoins equals a transfer of
UTXO. The amount of UTXO which a Bitcoin address is able to spend is the amount of
Bitcoins often referred to as the balance of that address. It is important
to know that every transaction has to spend a UTXO in full. So if a user is able to
spend 0.5 BTC through a UTXO, he has to spend the full 0.5 BTC. If he only wants
to send 0.30 BTC, he has to add another address, which is controlled by himself, to
receive the change of 0.20 BTC.
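The UTXO model described above can be sketched in a few lines of Python. This is an illustrative toy model under stated assumptions, not wallet or node code: the `Utxo` class, the `balance` and `spend` helpers and the address labels are hypothetical, amounts are kept in satoshi, and the miner fee is ignored, as in the 0.5 BTC example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utxo:
    txid: str        # transaction that created this output
    index: int       # position of the output in that transaction
    address: str     # address that is able to spend it
    value_sat: int   # amount in satoshi


def balance(utxos, address):
    """The balance of an address is the sum of its unspent outputs."""
    return sum(u.value_sat for u in utxos if u.address == address)


def spend(utxos, utxo, to_address, amount_sat, change_address, new_txid):
    """Consume one UTXO in full and create a payment and a change output."""
    if utxo not in utxos:
        raise ValueError("output is unknown or has already been spent")
    if amount_sat > utxo.value_sat:
        raise ValueError("amount exceeds the value of the UTXO")
    remaining = [u for u in utxos if u != utxo]
    remaining.append(Utxo(new_txid, 0, to_address, amount_sat))
    change = utxo.value_sat - amount_sat
    if change > 0:
        remaining.append(Utxo(new_txid, 1, change_address, change))
    return remaining
```

Spending a 0.5 BTC output to pay 0.30 BTC leaves the original address with a balance of zero and creates a 0.20 BTC output at the change address.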
In figure 2.2, Alice sends Bitcoins to Bob. She adds the hash of his public key to
the transaction. If Bob wants to spend the received Bitcoins, he has to prove
that he is the owner of the public key.
In P2SH transactions the sender sends his coins to a script called the redeem
script. This redeem script is provided by the receiver. Whenever the receiver tries
to spend UTXO which were sent to the P2SH address generated by him, the redeem
script is evaluated. Only if the redeem script evaluates to True can the
transaction's UTXO be spent by the receiver. While P2SH could be seen as a general
form of smart contract, most Bitcoin nodes will only accept standardized redeem
scripts [9].
Today P2SH transactions are mainly used to provide multisignature transactions.
In figure 2.3 the evaluation of the redeem script is based on multiple signatures.
In figure 2.3, Bob created a redeem script. Bob can choose under which conditions
the redeem script evaluates to True. Bob sends Alice the hash of the redeem script.
Alice sends the coins to the received hash. If Bob wants to spend the received coins,
the redeem script has to evaluate to True.
Based on the first character of an address, it can be determined whether it is a
P2PKH address or a P2SH address.
Table 2.4: Example of P2PKH and P2SH transaction hashes
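As a minimal sketch of this distinction: on the Bitcoin main network, P2PKH addresses start with the character 1 and P2SH addresses start with 3. The helper name `address_type` is hypothetical, and the function only inspects the prefix; it does not validate the Base58Check encoding or the checksum.

```python
def address_type(address: str) -> str:
    """Classify a mainnet address by its first character (no checksum validation)."""
    if address.startswith("1"):
        return "P2PKH"
    if address.startswith("3"):
        return "P2SH"
    return "unknown"
```

For the two output addresses of the example transaction in this chapter, `address_type("1E9bQsqFtf7SwTpZaiNvjq2d3BLNWo82ko")` yields `"P2PKH"` and `address_type("3H8tiRY6GfcTgjn2bDRjs9AwAgaT7KVE2P")` yields `"P2SH"`.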
It is worth mentioning that there are two more types of Bitcoin transactions which
were introduced through the SegWit soft fork (P2WPKH and P2WSH) [29]. However,
they are very similar to P2PKH and P2SH, since they are mainly focused on
moving the signature script to another location [9].
2.1.2.3 Replace-By-Fee
Opt-in Replace-By-Fee (RBF) is an optional feature which was introduced through
BIP0125 and has been implemented in Bitcoin Core nodes since version 0.12
[18]. Through the opt-in RBF feature a transaction can be resent with a higher fee
as long as it stays unconfirmed. The previously sent transaction will be ignored.
This feature can be useful if a transaction has been sent with too low a transaction
fee and might not get confirmed in the near future.
Opt-in RBF is activated whenever the sequence number of at least one of a
transaction's inputs is set to a value lower than 0xfffffffe (4294967294) [18].
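The signalling rule can be expressed as a short check on the sequence numbers of a transaction's inputs. This is a sketch only; the function name `signals_opt_in_rbf` and the constant name are hypothetical.

```python
RBF_THRESHOLD = 0xFFFFFFFE  # 4294967294


def signals_opt_in_rbf(sequences):
    """Return True if at least one input sequence number is below the threshold."""
    return any(seq < RBF_THRESHOLD for seq in sequences)
```

A transaction whose inputs all carry the sequence number 4294967294 or 4294967295 therefore does not signal replaceability.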
It is worth mentioning that there are further methods to replace unconfirmed
transactions, such as first-seen-safe Replace-By-Fee (fss RBF) and full
Replace-By-Fee (full RBF). Since all RBF methods are enforced by Bitcoin nodes,
miners do not necessarily have to honor them.
It is important to state that, if the locktime features are to be activated, the
transaction's version has to be set to at least 2 [23].
{
  "result": {
    "txid": "81b46084e181eea9d846b2400e91a545178e61ca4a7730e9c0e3c15f7322a778",
    "hash": "81b46084e181eea9d846b2400e91a545178e61ca4a7730e9c0e3c15f7322a778",
    "size": 517,
    "vsize": 517,
    "version": 2,
    "locktime": 489733,
    "vin": [
      {
        "txid": "b5a4d2d97d5b8dcfbccdb366dad557c507bdf6219bbd83eb7f9ce4f719995b20",
        "vout": 0,
        "scriptSig": {"hex": "4730..."},
        "sequence": 4294967294
      },
      {
        "txid": "44b88be87299336ec02f94677ebaac6f63e37dbf21e38ba0829196937dc84a33",
        "vout": 0,
        "scriptSig": {"hex": "473044022..."},
        "sequence": 4294967294
      },
      {
        "txid": "e0c68b0b557700f26f65e27aea5ab4003e6a2a28af140db7c4508429d6b23873",
        "vout": 1,
        "scriptSig": {"hex": "4730440..."},
        "sequence": 4294967294
      }
    ],
    "vout": [
      {
        "value": 0.20000000,
        "n": 0,
        "scriptPubKey": {
          "reqSigs": 1,
          "type": "scripthash",
          "addresses": [
            "3H8tiRY6GfcTgjn2bDRjs9AwAgaT7KVE2P"
          ]
        }
      },
      {
        "value": 0.01359680,
        "n": 1,
        "scriptPubKey": {
          "type": "pubkeyhash",
          "addresses": [
            "1E9bQsqFtf7SwTpZaiNvjq2d3BLNWo82ko"
          ]
        }
      }
    ],
    "blockhash": "000000000000000000ec8d98e4ccdf87b35b21a980098d316f166442fe47c81c",
    "confirmations": 3973,
    "time": 1508008653,
    "blocktime": 1508008653
  }
}
Some parameters have been truncated or removed. As we can see in listing 2.7,
the transaction has three inputs and two outputs specified. One of the outputs is a
P2PKH address and the other is a P2SH address.
Furthermore a locktime has been set. As already discussed, BIP0068 re-
quires the version to be set to 2 or greater whenever the locktime feature
is being used. As we can see, this requirement is met in this transac-
tion. Since the locktime is set to 489733, the first block which this trans-
action could be added to is 489734. In fact, this transaction was included
in block 000000000000000000ec8d98e4ccdf87b35b21a980098d316f166442fe47c81c at
the block height 489831, which is nearly 100 blocks later than the specified lock-
time. Furthermore the sequence numbers are set to 4294967294 (0xfffffffe), which
indicates that (opt-in) RBF is deactivated and a locktime could be set.
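The locktime reasoning above can be retraced with a small Python sketch. It relies on two rules of the Bitcoin protocol: a locktime below 500,000,000 is interpreted as a block height, and the locktime is only enforced if at least one input has a sequence number below 0xffffffff. The function names are hypothetical and the values are taken from the example transaction.

```python
SEQUENCE_FINAL = 0xFFFFFFFF
LOCKTIME_THRESHOLD = 500_000_000  # below this value, locktime is a block height


def locktime_enabled(locktime, sequences):
    """Locktime is enforced if it is non-zero and at least one input is not final."""
    return locktime != 0 and any(seq < SEQUENCE_FINAL for seq in sequences)


def earliest_block_height(locktime):
    """First block height that may include a transaction with a height-based locktime."""
    if locktime >= LOCKTIME_THRESHOLD:
        raise ValueError("locktime is a timestamp, not a block height")
    return locktime + 1


sequences = [4294967294, 4294967294, 4294967294]
enabled = locktime_enabled(489733, sequences)   # locktime is in effect
first_height = earliest_block_height(489733)    # 489734
delay_in_blocks = 489831 - 489733               # blocks between locktime and inclusion
```

With these values the transaction was confirmed 98 blocks after the height given in its locktime field.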
Interestingly, the transaction time and the block time are the same. However, if we
check the block time and the transaction timestamp in a blockchain explorer like
blockchain.info, we see that they actually differ.
This is the case because the timestamp of a transaction is not saved in the
blockchain data. The timestamps are logged by the node. However, our node was not
connected to the network when the transaction was published, so the first time our
node noticed this transaction was when it parsed this block.
We are mainly focused on privacy aspects in Bitcoin, so we are not going to discuss
this simple raw transaction in further detail. However, it is important to see that
there are many different features which can be used in Bitcoin transactions. Some
of them are used very rarely. Moreover, there are several ways to correctly sign a
Bitcoin transaction at a low level.
In the following chapters we will show that the analysis of Bitcoin transactions can
lead us to identify implementations of generic transaction generation. Through this
we will be able to identify and break networks that are used to enhance privacy in
Bitcoin.
2.2 Fungibility
In an economic sense, a good is fungible if it is interchangeable with other
individual units of the same asset. Typically, government-issued currencies and
assets like gold are fungible [28].
For example, the same amount of gold with the same weight and purity normally has
the same value. However, this does not necessarily apply to Bitcoin.
Because Bitcoin is used in criminal activity, a blacklist of Bitcoin addresses is
heavily discussed and has already been implemented by multiple service providers
[36]. Since every Bitcoin transaction is publicly accessible, a blacklist can easily
be enforced. There are multiple proposals and services that claim to enhance privacy
in Bitcoin and transform it into a fungible asset. We are going to discuss them in
the following sections.
2.3 Privacy in Bitcoin
This method was implemented to provide better privacy. In contrast to the model
described above, where all transactions are sent and received by a single address,
it definitely enhances the privacy of the Bitcoin ecosystem. However, it is still
easy to spot which public keys belong to a single user, which we will show through
the next examples. Figure 2.8 shows a simplified Bitcoin transaction. None of the
addresses used as inputs will ever be used again to receive any UTXO.
Figure 2.8 is simplified, since the sum of both inputs exactly matches the output
amount (0.1 + 0.2 = 0.3) and no fee has been applied. It is important to know that
the whole amount of an input is always spent. The sender would always have to spend
the full 0.30 BTC, even if he only wants to send 0.25 BTC.
Since we already know that the transaction fee equals the sum of the inputs minus
the sum of the outputs, the sender would have paid a fee of 0.05 BTC in
this transaction. We can clearly see this in figure 2.9. A more dramatic example
would be a sender who has received 1 BTC at his address in a single transaction
and wants to send 0.001 BTC. This is shown in figure 2.10.
As we can see in figure 2.10, the sender would have paid a fee of 0.999 BTC for
transferring 0.001 BTC. This happens because in Bitcoin the full UTXO always has
to be spent.
Typically, the fee should not be determined by the value of the UTXO; it should
rather be adjusted dynamically according to the network traffic.
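The two fee examples above can be recomputed with a short Python sketch. Amounts are converted to satoshi to avoid floating-point rounding; the helper names `btc_to_sat` and `fee_sat` are hypothetical.

```python
from decimal import Decimal

SAT_PER_BTC = 100_000_000


def btc_to_sat(amount: str) -> int:
    """Convert a BTC amount given as a decimal string into satoshi."""
    return int(Decimal(amount) * SAT_PER_BTC)


def fee_sat(input_values, output_values):
    """The fee is whatever part of the inputs is not assigned to an output."""
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee


# Inputs of 0.1 and 0.2 BTC, a single output of 0.25 BTC: fee of 0.05 BTC.
small_fee = fee_sat([btc_to_sat("0.1"), btc_to_sat("0.2")], [btc_to_sat("0.25")])

# A single 1 BTC input, a single output of 0.001 BTC: fee of 0.999 BTC.
large_fee = fee_sat([btc_to_sat("1")], [btc_to_sat("0.001")])
```

Both results follow directly from the rule that every input is spent in full.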
To solve this problem, Bitcoin uses change addresses, often also referred to as
refund addresses. Change or refund addresses are Bitcoin addresses which
are automatically generated by the wallet software [31]. They receive the change of
a sent transaction. Whenever the spent inputs do not exactly match the amount
the sender wants to send, the change is returned to the change address.
For privacy reasons, a new change address is typically generated for every
transaction. Since transaction fees are usually calculated dynamically, most
transactions make use of change addresses.
The whole process of generating and managing change addresses is handled by the
wallet software.
As we can see in figure 2.11, the sender sent 0.001 BTC through Transaction 0.
He paid a miner fee of 0.003 BTC and received change of 0.996 BTC at a newly
generated change address. From this address he sent another payment (TX 2)
of 0.25 BTC, for which he paid a fee of 0.005 BTC. The change of this transaction
(0.745 BTC) is held as UTXO at another newly generated address
and can be spent in further transactions. However, this example is still not quite
realistic, since fees are typically generated dynamically and specified with up to 8
decimals.
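How a wallet might derive the change output can be sketched as follows. The function `build_outputs` and the address labels are hypothetical, and amounts are given in satoshi; the numbers reproduce Transaction 0 from the example above (1 BTC input, 0.001 BTC payment, 0.003 BTC fee).

```python
def build_outputs(input_sat, payment_sat, fee_sat, receiver, change_address):
    """Return the outputs of a transaction that spends input_sat in full."""
    change = input_sat - payment_sat - fee_sat
    if change < 0:
        raise ValueError("inputs do not cover payment and fee")
    outputs = [(receiver, payment_sat)]
    if change > 0:
        # Everything that is neither payment nor fee goes back to the sender.
        outputs.append((change_address, change))
    return outputs


outputs = build_outputs(
    input_sat=100_000_000,   # 1 BTC
    payment_sat=100_000,     # 0.001 BTC
    fee_sat=300_000,         # 0.003 BTC
    receiver="receiver",
    change_address="change_0",
)
```

The second output carries 99,600,000 satoshi, which is the change of 0.996 BTC described for Transaction 0.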
We saw through figure 2.10 that the use of change addresses is essential to adjust
the miner fee in a precise way. However, our next example shows how this system
undermines the privacy gained by using newly generated addresses.
In figure 2.12, we analyze three transactions sent by the same person with a
realistic choice of dynamically generated fees. All three transactions use unique
addresses which were never combined in any transaction before. An attacker could
not distinguish whether these addresses are controlled by one person or by multiple
people. However, the attacker tries to identify the change addresses of these
transactions. At least in the cases of Transaction 0 and Transaction 1 he could
argue that 0.04921479 and 0.00447406 are the outputs sent to the change addresses,
since the change is often lower than the sent amount whenever multiple inputs of
similar size are combined in a transaction. Even in Transaction 2, where only one
input is given, he could argue that 0.09960487 is sent to the change address, since
0.15 BTC seems to be a more reasonable amount for a manual payment. It is important
to state that these arguments can only be indicators to distinguish between change
and receiver addresses. Yet there is far more information that can be taken into
consideration. In most cases transactions are generated automatically. A further
analysis of transaction flows, the way a transaction is signed, the way an
implementation adds a change address and many more factors can be used to
distinguish between change and receiver addresses. For simplicity we will stick to
the basic indicators mentioned above and assume that we are able to distinguish
between change and receiver addresses. In our practical verification, we are going
to show that our basic indicators are met in most real-world cases.
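One of the indicators above, that a manually chosen payment tends to be a "rounder" amount than the automatically computed change, can be sketched as a simple heuristic. This is an illustrative simplification and not the classification used in our implementation; the function names are hypothetical and the heuristic only handles transactions with exactly two outputs.

```python
from decimal import Decimal


def decimal_places(amount: str) -> int:
    """Number of significant decimal places of a BTC amount given as a string."""
    exponent = Decimal(amount).normalize().as_tuple().exponent
    return max(0, -exponent)


def guess_change_output(outputs):
    """Return the index of the output that is more likely the change, or None.

    The output with more decimal places is assumed to be the change. If both
    outputs are equally "round", the heuristic makes no statement.
    """
    if len(outputs) != 2:
        return None
    first, second = decimal_places(outputs[0]), decimal_places(outputs[1])
    if first == second:
        return None
    return 0 if first > second else 1
```

For Transaction 2, `guess_change_output(["0.15", "0.09960487"])` returns index 1, which matches the argument that 0.09960487 BTC is the change. Like every indicator of this kind, the result is a hint rather than a proof.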
We can clearly see in figure 2.13 that Transaction 3 uses two inputs which seem to
be the change of previous transactions. Since the sender of this transaction must
have access to the private keys of both change addresses, he is the sender of
Transaction 0, Transaction 1 and Transaction 3. While this case looks simplified, it
reflects the standard behavior which is implemented in most Bitcoin wallets [9].
The change outputs are typically combined automatically in the next transaction in
such a way that the following transaction is as cheap as possible.
While the attacker now knows that each output address of Transaction 0,
Transaction 1 and Transaction 3 is owned by the same person, he does not know if
there may be more addresses connected to the owner. However, since the next
transaction which uses UTXO 0.0035176 as input will probably be connected again
with a change address, the attacker will gather more possible addresses which are
owned by the victim. Over time the attacker is able to gather more information,
which he can use to identify all addresses that are managed through the wallet
used by the individual.
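The reasoning that all inputs of one transaction are controlled by the same sender can be turned into a clustering procedure. The following is a minimal union-find sketch under that assumption; it is not the crawler described later in this thesis, and the function name and address labels are hypothetical.

```python
def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of the same transaction.

    transactions: iterable of lists, each holding the input addresses of one
    transaction. Returns a list of sets of addresses.
    """
    parent = {}

    def find(address):
        parent.setdefault(address, address)
        while parent[address] != address:
            parent[address] = parent[parent[address]]  # path compression
            address = parent[address]
        return address

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for address in inputs:
            find(address)  # register addresses, including single inputs
        for address in inputs[1:]:
            union(inputs[0], address)

    clusters = {}
    for address in list(parent):
        clusters.setdefault(find(address), set()).add(address)
    return list(clusters.values())
```

If one transaction spends from addresses a and b and a later one spends from b and c, all three addresses end up in the same cluster. This mirrors how an attacker accumulates addresses of the same wallet over time.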
Privacy could be enhanced by either not using change addresses or not combining
any UTXO. We already discussed that both approaches would be very expensive.
A possible solution could be to use different Bitcoin wallets, each of which
controls multiple addresses. In this case the balance of each wallet could be
determined by an attacker, but, as long as the wallets are never combined, the total
balance of all wallets cannot be determined. While this approach could enhance
privacy in some situations, it still has considerable drawbacks in practical use,
which we are not going to discuss in detail.
We can finally say that the approach Satoshi Nakamoto described in his whitepaper
enhances the privacy of the Bitcoin network in a coarse way, but an
attacker is still able to identify and track a specific user by simple blockchain
analysis.
In figure 2.14 we can see a typical mixing process, which is normally initiated by the
sender. He sends Bitcoins which can be traced back to him (tainted coins) into the
mixing network and receives anonymized (untainted) Bitcoins from the network. An
attacker should neither be able to trace the origin of the untainted Bitcoins nor
be able to follow the tainted coins to identify the untainted coins. There should be
no connection between the tainted and untainted coins which could be analyzed by
an attacker. In recent years multiple mixing methods have been developed.
We will divide them into three categories.
Figure 2.15 shows the structure of a P2P mixing service. Multiple scientific papers,
such as [33], [44] or [26], about algorithms which could make it possible to
transfer Bitcoins anonymously have been published. Some of those algorithms have
been implemented in different cryptocurrencies by default (e.g. Zerocash [42]), but
at the time of writing, none of those approaches has been widely adopted in the
Bitcoin network. Unlike Centralized Mixing Services (CMS), P2P mixing has to be
implemented in Bitcoin wallet software to be accessible to users.
In figure 2.16 we can see the structure of a centralized mixing service. Centralized
Mixing Services (CMS) are typically provided through a website.
While P2P mixing does not rely on a central instance, centralized mixers are
typically run by a commercial website provider who advertises that his service is
able to anonymize Bitcoin transactions.
The customer specifies the addresses at which he wants to receive the anonymized
Bitcoins. After that, the customer sends Bitcoins to an address which the CMS
generated individually for him. When the transaction has been confirmed by the
network and an optionally specified delay has elapsed, the anonymized coins are sent
to the customer's addresses. For providing this service, CMS typically charge a fee
of up to 3% of the initial amount of untainted coins. P2P mixing algorithms are
often used internally by CMS.
cannot perform blockchain analysis. However, since the Lightning Network is still
under development and even its layered structure is not finalized yet, we won’t
discuss it in great detail [38].
Besides the Lightning Network there are more approaches which are intended to
enhance privacy and do not fully rely on the original Bitcoin blockchain. In
particular, TumbleBit [26] and the use of sidechains (Drivechains) [17] should be
mentioned here.
3 Centralized Mixing Services
In this chapter, we provide general information about CMS and discuss possible
attacks against them. Most CMS advertise that they are able to anonymize their
customers' Bitcoin payments.
3.1 Advantages
While no decentralized mixing technique is widely adopted yet, there exist several
CMS which are frequently used. The user does not have to install any software
or execute any script to use a CMS. Typically no registration is required and the
mixing process can easily be started through a publicly accessible website.
Furthermore, the customer is often able to change optional settings (e.g. time
delay) which should enhance privacy. Since no specific software is required to use
CMS, the services can also be used by customers who use online Bitcoin wallets.
3.2 Disadvantages
The main disadvantage of CMS is that they are centralized, commercially driven
services.
Typically, these mixing services can be accessed through a publicly accessible
website. None of the service providers is personally known. Since the customer has
to send assets in the form of Bitcoins to the service, he can easily be defrauded.
Furthermore, logs of the mixing process could be stored, which could lead to an
exposure of personal data which is not even stored in the blockchain (e.g. IP
address, user agent). Some CMS publicly state which internal mixing method they are
using, while others do not mention how they mix the customer's coins. We could not
find a fully open source implementation of a Bitcoin mixing service.
Unlike P2P mixing protocols, which are typically published in scientific papers,
CMS can only be attacked as a black box. Since all transactions are permanently
stored in the Bitcoin blockchain, an implementation bug could easily lead
to the deanonymization of every transaction ever processed by the service. Even
if a transaction is anonymized through a mixing service today, it may be
deanonymized in the future.
While some of these drawbacks also apply for on-chain P2P mixing algorithms,
they are more harmful to CMS, since their implementation cannot be publicly
reviewed. Some CMS make use of P2P mixing techniques, however, the imple-
B - An attacker who has access to the Bitcoin network and is able to start
mixing procedures through the centralized mixing service. He is able to
send tainted Bitcoins to the mixing service and receive untainted coins from the
mixing service.
C - An attacker who has access to the Bitcoin network and is able to forge
multiple nodes. Because of the P2P network, this usually requires multiple IP
addresses (IPv4 and IPv6) and a powerful server.
In all mentioned attack scenarios, we assume that the attacker is able to retrieve
publicly accessible statistics which may be published by the centralized mixing
service.
There are different approaches to analyzing this data. While blockchain analysis
is a powerful attack which can be used for various purposes, it is often confused
with taint analysis. A blockchain analysis may use all accessible blockchain data
to gather the wanted information, while a taint analysis focuses only on
the transaction flow.
When identifying specific implementations (e. g. mixing services) in the Bitcoin
network, side-channels are important to mention. Through side-channels, specific
properties of transactions (e. g. time, size, way of signing) can be used to
identify a specific implementation. When attacking CMS it is often crucial to
identify the centralized mixer as a subnetwork of the Bitcoin network.
As we have already mentioned, there are several ways to construct valid Bitcoin
transactions. Through blockchain analysis an attacker is able to analyze the
specifics of black-boxed mixing services. His aim is to determine whether a
transaction is part of the service’s subnetwork.
Once he has achieved this, he tries to deanonymize the transaction. A single
side-channel which identifies the service’s subnetwork could break the whole
centralized mixing service. Such a side-channel could lead to the deanonymization
of all transactions the service ever processed.
Until early 2017 blockchain.info provided a service which visualized the taint of
an address. The so-called Taint Analysis evaluates the associations between
multiple addresses and shows how strong the links between them are [35]. The
analysis was introduced to evaluate how much anonymity a specific mixing service
provides. The mixing service should provide untainted coins, which means that
the customer’s input addresses should not be connected to his output addresses.
This analysis has often been used to determine how well a mixing service
anonymizes transactions [34].
While a taint analysis only takes direct connections between addresses into account,
other approaches like sophisticated blockchain analysis or statistical tests are able
to use side-channels to find associated addresses which are not directly connected.
In this way a blockchain analysis could lead to the deanonymization of addresses
which are not tainted.
Whether taint analysis is a suitable tool to determine a transaction’s origin is
controversial, since exchanges and other services in the Bitcoin ecosystem create
new links between addresses.
The Taint Analysis function, introduced by blockchain.info, has recently been re-
moved. At the moment there is no known publicly available tool which is able to
perform a taint analysis.
CMS are typically accessible through a website. Web security bugs and
vulnerabilities could compromise the service. We are going to discuss some
vulnerabilities which could lead to the deanonymization of customers.
3.4 Possible attacks 25
It should be mentioned that this list is not complete. It is important to know
that, depending on the attacker model, even bugs which do not seem to be
vulnerabilities could be combined with other attacks like blockchain analysis
or sybil attacks to deanonymize customers.
While in common attack scenarios gaining access to access logs might not be the
main goal, in the case of a centralized mixing service it could easily lead
to the deanonymization of every customer.
3.4.4 DDoS
Requires attacker: A and resources to successfully execute a DDoS attack.
DDoS attacks can harm CMS and can even be used to compromise the customer’s
privacy. The simplest case is a DDoS attack that blocks customers from using
CMS.
More interesting are cases where a DDoS attack allows an attacker to gather
information about the black-boxed system. Since the security of CMS is often based
on time delays, a DDoS attack could influence the mixing process in a way that
leaks information which identifies customers.
If a node did not specify static connections, a DDoS attack could also be used to
successfully carry out a sybil attack, since the victim could reconnect to the
attacker’s forged nodes.
We have already discussed RBF and other features which allow a sender to update
and replace a Bitcoin transaction. These features may lead to vulnerabilities if
CMS send their untainted coins to the customer without waiting until the customer’s
transaction has been included in the blockchain. Since the user is able to change
the transaction, the mixing service could end up sending untainted coins without
receiving the user’s tainted coins.
In general, a service should wait for at least three confirmations before it
treats an incoming transaction as confirmed [5].
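The confirmation rule can be sketched in a few lines. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the replaceability check uses the opt-in signaling rule of BIP 125 (an input sequence number below 0xFFFFFFFE).

```python
# Sketch: decide whether an incoming payment may be treated as final.
# All names are hypothetical; they do not come from any mixing service.

MIN_CONFIRMATIONS = 3            # threshold recommended in the text
RBF_SEQUENCE_LIMIT = 0xFFFFFFFE  # BIP 125: a lower input sequence signals RBF


def confirmations(tx_block_height, chain_tip_height):
    """Number of confirmations; 0 while the transaction is unconfirmed."""
    if tx_block_height is None:
        return 0
    return max(0, chain_tip_height - tx_block_height + 1)


def signals_rbf(input_sequences):
    """True if at least one input opts in to Replace-By-Fee."""
    return any(seq < RBF_SEQUENCE_LIMIT for seq in input_sequences)


def safe_to_pay_out(tx_block_height, chain_tip_height,
                    min_confirmations=MIN_CONFIRMATIONS):
    """True once the incoming transaction is buried deeply enough."""
    return confirmations(tx_block_height, chain_tip_height) >= min_confirmations
```

A service following this sketch would pay out only when safe_to_pay_out returns True, regardless of whether the unconfirmed transaction signaled replaceability.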
Stale blocks are Bitcoin blocks which fulfill all requirements to be evaluated as
a valid Bitcoin block, but they are not part of the main Bitcoin blockchain
[9].
In figure 3.1 we can see a stale and an orphan block. Normally stale blocks are
created whenever at least two miners mine a block at roughly the same time
[25]. While both blocks are valid, only one of them can be added to the blockchain,
since the main blockchain cannot contain multiple blocks at the same block
height. Whenever multiple blocks were mined at the same block height, miners can
choose on which block they want to mine their next block. Typically miners choose
the block they received first [5]. The block which is not built upon becomes the
stale block, which will not be part of the main blockchain.
Since the stale block is not part of the main blockchain, none of the transactions
included in this block are confirmed. However, the transactions can still be a part
of other blocks in the main blockchain.
A miner could exploit this behavior by deliberately creating blocks which are going
to become stale later on. The miner would be able to include several transactions
in his stale block which appear confirmed through his block, knowing they will be
reversed later on. From the point of view of the main blockchain, these
transactions were never made. If a mixing service does not wait for multiple
confirmations, it may send the untainted coins to the attacker. While the
attacker’s input transaction will be ignored, since it only exists in a stale
block, the output transaction will be included in the main blockchain.
However, it is hard to deliberately create a stale block which is accepted by the
network at first but ignored later on. Furthermore, the mining reward will be
lost. To carry out this attack successfully, a sybil attack may be necessary
prior to the mining. In this case the attacker controls which blocks the
centralized mixing service is able to receive and would be able to disconnect the
service from the main Bitcoin network until he has mined his own stale block. The
same drawbacks we described for the sybil attack apply to this attack.
Requires attacker: B
When a stale block is created, the shorter chain is typically discarded.
However, this is not always true. In case of a planned fork, miners could keep
working on the shorter chain. This happened on 2017-08-01 with the Bitcoin
Cash (BCH) fork. Since both chains are valid, the UTXO which were created before
the chain split are valid on both chains. If the forked chain does not have any
replay protection, it is possible to spend UTXO on the old chain and replay this
transaction on the forked chain, and vice versa.
However, CMS are typically only connected to one of these chains. If no replay
protection is applied, an attacker is able to replay the transaction which the CMS
sent on one chain on the other chain. The attacker will receive coins on both
chains, since he owns the private key on both chains, although he only sent his
coins to the centralized mixer on one chain.
Typically the forked chain should implement replay protection.
3.4.6 Conclusion
As we can see, there are multiple possible attacks against CMS. The implementation
has to be secured against Bitcoin network weaknesses and web security
vulnerabilities. Furthermore, a secure mixing algorithm has to be implemented.
Even if all of these layers are implemented in a secure fashion, they could lead to
side-channels when they are combined. A centralized mixing service should be
secured against all of these vulnerabilities. It should be able to automatically
process transactions without leaking information about the mixing process. An
attacker should not be able to differentiate between transactions which are
connected to the mixing service and other transactions found in the blockchain.
A general drawback of CMS is that their service is commercially driven and not open
source. The customer has to trust the service provider.
4 Attack on coinmixer.se
With a mixing volume of around 120 Bitcoins per week, coinmixer.se is probably
one of the most frequently used centralized Bitcoin mixing services available [6].
In this chapter we are going to implement an attack on this service. Our aim is
to create a tool which allows us to deanonymize transactions which were previously
anonymized by coinmixer.se.
Every week, coinmixer.se publishes the amount of mixed Bitcoins and the number
of performed anonymizations on their website.
Even though the customer is able to specify multiple forward addresses, he will
always have to pay the tainted Bitcoins to a single address controlled by
coinmixer.se.
Figure 4.2: The default case of a mixing process and a case which makes use of the
optional settings
of mixing processes it has performed in the last week. At the time of writing the
service had processed only around 1300 mixings in the last week (2017-10-20
00:00:00 UTC - 2017-10-26 23:59:59 UTC) [6].
Since the number of processed mixings is small, our main attack will focus on map-
ping customer’s input transactions to coinmixer’s output transactions based on the
amount of sent untainted Bitcoins.
To achieve this, we need to accomplish three steps:
[38]. When the SegWit feature is widely adopted the maximum number of TPS
should increase [38].
With a theoretical maximum of 7 TPS we arrive at a weekly maximum of 4,233,600
transactions. This number of transactions would be sent in 1008 blocks, given a
block generation time of 10 minutes.
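The weekly figures follow directly from the stated 7 TPS and the 10 minute block interval. The short calculation below reproduces them; the variable names are chosen only for this illustration.

```python
# Reproduce the weekly upper bound from the stated throughput limit.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60   # 604800 seconds
MAX_TPS = 7                           # theoretical maximum from the text
BLOCK_INTERVAL_SECONDS = 10 * 60      # block generation time of 10 minutes

max_tx_per_week = MAX_TPS * SECONDS_PER_WEEK
blocks_per_week = SECONDS_PER_WEEK // BLOCK_INTERVAL_SECONDS
max_tx_per_block = max_tx_per_week // blocks_per_week

print(max_tx_per_week, blocks_per_week, max_tx_per_block)
```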
However, 4,233,600 transactions per week is only the theoretical maximum. In the
last week (2017-10-20 00:00:00 UTC - 2017-10-26 23:59:59 UTC) only around 2,182,236
transactions were processed by the Bitcoin network, even though the average
block size was bigger than 1 MB [1]. Through the SegWit feature miners are
able to create blocks which are bigger than 1 MB [38].
Either way, our implementation would need to identify a maximum of 7800
transactions among more than 2 million transactions.
For easier understanding we are using the term "fee" when we refer to "fee per
byte", knowing that a transaction’s fee is in fact a different value.
removed.
376,567 of 2,150,927 transactions used version 2 and specified a locktime (17.51 %).
362,709 of 2,150,927 transactions used version 2, specified a locktime and set the
sequence number of every input to 4294967294 (16.86 %).
As we can see, 362,709 (16.86 %) of 2,150,927 transactions fulfill the mentioned
indicators and could therefore be coinmixer.se transactions.
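A filter for these three indicators might look like the following sketch. The dictionary keys (version, locktime, input_sequences) are assumptions made for this example and do not correspond to a particular API response format.

```python
# Sketch of the version/locktime/sequence filter; field names are hypothetical.
COINMIXER_SEQUENCE = 4294967294  # 0xFFFFFFFE, the value observed on every input


def meets_strong_indicators(tx):
    """True if tx uses version 2, sets a locktime and the observed sequence."""
    sequences = tx["input_sequences"]
    return (
        tx["version"] == 2
        and tx["locktime"] > 0
        and len(sequences) > 0
        and all(seq == COINMIXER_SEQUENCE for seq in sequences)
    )


def share_percent(matching, total):
    """Share of matching transactions in percent, rounded to two decimals."""
    return round(100.0 * matching / total, 2)
```

Applied to the counts given above, share_percent(362709, 2150927) reproduces the stated 16.86 %.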
Still, the indicators do not seem to filter out enough transactions, since, based
on our assumption, a maximum of 6825 transactions could have been sent by the
service. Since every transaction we received from coinmixer.se met the indicators
(version, locktime, sequence number), we will refer to them as strong indicators.
However, the fee indicator might be an even better way to determine coinmixer’s
output transactions, since Bitcoin clients typically use dynamic fee adjustment
[5].
Through dynamic fee adjustment, the client calculates the fee based on the
Bitcoin network traffic, while coinmixer’s fees seem to be fixed and only adjusted
on a coarse scale. While the analysis of version, sequence and locktime is pretty
easy to accomplish, the analysis of the fee must take into account that the fee
might change in the future.
We have to reduce the block range to be able to obtain reliable results, since we
have to be sure that the fee does not change within our testing time frame.
We received 14 transactions in the time range between 2017-09-26 00:58:49 UTC
and 2017-09-28 01:37:46 UTC.
As we can see in table 4.5, all transactions we received in that time frame were
sent with the same fee. The first transaction, which we received at 2017-09-26
00:58:49 UTC, was included in the Bitcoin block with height 486977. The last one,
which we received at 2017-09-28 01:37:46 UTC, was included in the block with
height 487265. We assume that in this period of time every transaction sent by
coinmixer.se to customers has a fixed fee of 123 sat/Byte (±1).
531,558 transactions were sent between block heights 486977 and 487265. Similar
to our previous analysis, 17.19 % (91,382) of the 531,558 transactions fulfilled
the version, sequence and locktime indicators. However, when we applied the fee
indicator and filtered for transactions with a fee between 122 sat/Byte and
124 sat/Byte, we received a result set of 2,839 (0.53 %) of 531,558 transactions.
When we tightened the fee indicator to only show results where the fee is exactly
123 sat/Byte, the result set shrank to 1,057 (0.20 %) transactions.
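The fee filter can be illustrated as follows. This is a minimal sketch that assumes each transaction is given as a dictionary with its absolute fee in satoshi and its size in bytes; both key names are hypothetical.

```python
# Sketch of the fee-per-byte filter; the keys "fee" and "size" are assumed.

def fee_per_byte(fee_satoshi, size_bytes):
    """Fee rate in sat/Byte."""
    return fee_satoshi / size_bytes


def matches_fee(tx, expected=123, tolerance=1):
    """True if the fee rate lies within expected ± tolerance sat/Byte."""
    return abs(fee_per_byte(tx["fee"], tx["size"]) - expected) <= tolerance


def filter_by_fee(transactions, expected=123, tolerance=1):
    """Keep only transactions whose fee rate matches the expected value."""
    return [tx for tx in transactions if matches_fee(tx, expected, tolerance)]
```

With tolerance=0 the same function corresponds to the tightened filter of exactly 123 sat/Byte.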
Even though all of the transactions we received from coinmixer.se had the fee set
to 123 sat/Byte, we realized in further analysis that in some settings the fee was
off by one sat/Byte. So we decided to stick with a tolerance of ±1 for our
implementation.
When we applied all four indicators (version, locktime, sequence number, fee) to
the gathered transactions between block heights 486977 and 487265, we received a
filtered output of 2,839 transactions. Only 0.53 % of all transactions sent in
that period fulfilled the mentioned indicators.
While the fee indicator seems to be good for filtering purposes, it should not be
used as a strong indicator, since the specified fee may change.
It should be stated, that we only used simple characteristics as indicators, which can
easily be spotted through a high-level comparison of standard Bitcoin transactions
sent through a Bitcoin client and coinmixer’s output transactions. On a low-level
4.4 Identifying coinmixer.se’s network 37
When we analyzed the sent coinmixer transactions, we were able to distinguish
customer’s and coinmixer’s transactions based on the characteristics we described in
the last subsections. Furthermore, customers often specify only up to four decimals
and use common values. We define a value to be uncommon if it is specified to
more than four decimals (e. g. 0.9286472 BTC).
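The definition of an uncommon value can be checked without floating point arithmetic by working in satoshi. The helper names below are hypothetical and serve only as an illustration.

```python
# Sketch: detect amounts specified to more than four decimals of a Bitcoin.
from decimal import Decimal

SATOSHI_PER_BTC = 100_000_000


def btc_to_satoshi(amount_btc):
    """Convert a decimal string such as "0.9286472" to satoshi."""
    return int(Decimal(amount_btc) * SATOSHI_PER_BTC)


def is_uncommon_value(value_satoshi, max_decimals=4):
    """True if the amount needs more than max_decimals decimals in BTC."""
    step = SATOSHI_PER_BTC // 10 ** max_decimals  # 10,000 satoshi by default
    return value_satoshi % step != 0
```

An amount with at most four decimals is always a multiple of 10,000 satoshi, which is why a simple remainder check is sufficient.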
It seems that the coinmixer’s network sends an output transaction to a customer
and receives the change on a change address. After that, the change, sometimes
combined with other changes, is sent to the next customer. The change is again
stored on a new change address and reused for the next customer, and so on.
We also noticed that the customer’s outputs are sometimes unspent, while the
change addresses in all of our analyzed transactions were spent. Customer’s
addresses could also be differentiated from coinmixer’s by the flow of the
previous and the following transactions.
Some customers’ addresses received and sent multiple transactions, while coinmixer
addresses only received a single transaction and sent a single transaction.
So far we have spotted several indicators to differentiate between customer and
change addresses. They can be found in table 4.7.
Table 4.7: Indicators to distinguish between customer’s and coinmixer’s address
Based on these indicators we were able to follow the chain of change addresses, which
can be seen in figure 4.8.
Through the analysis of the transaction flow we were able to identify the internal
mixing process. While we were able to identify possible customer’s and internal
addresses in a forward manner, it may be also interesting to take a look back. As
we already know, sometimes change addresses are combined. This could provide us
with more information about the network, since till now we only analyzed a single
chain of change addresses.
When we analyzed the input addresses, we recognized some addresses which are
probably used as cash-in addresses for customers and others that are used for
output transactions. The cash-in addresses could be identified because the
transactions funding them are sent by the customer and typically do not fulfill
the mentioned indicators, while the input transactions which are used for output
transactions do meet them.
As we can see in figure 4.9, we have presumably identified a method to spot the
coinmixer.se network. However, we have not confirmed it yet. A way to verify our
assumptions would be to build the network from a given input transaction and
check whether another unique input transaction, which we made afterwards, can be
found in the resulting network.
40 4 Attack on coinmixer.se
In figure 4.10 we perform two unique mixing processes to confirm the constructed
coinmixer network. Through Our output address 1 we receive the untainted coins
for the tainted coins of Our address 1. To confirm the network, we started a second
mixing process. Based on the second output address and the known indicators, we
reconstructed the coinmixer network.
If our first transaction can be found in this network, we assume that we were able
to reconstruct the correct coinmixer network.
While we were able to describe the method to reconstruct the coinmixer.se network
based on the spotted indicators, we want this to be done automatically. To achieve
this, we are going to implement a crawler which is able to reconstruct the network
based on a given coinmixer output transaction.
Using the method described above, we are going to verify in chapter Results
that our crawler is able to build the correct coinmixer.se network.
4.5 Crawler
As we already described in the previous section, the crawler should be able to build
the coinmixer network from a given coinmixer output transaction. To achieve this,
we are going to implement two different ways of crawling: the forward crawling,
which takes a coinmixer output transaction and follows the change addresses until
it reaches the end of the coinmixer network, and the backward crawling, which also
takes a coinmixer output transaction but analyzes the input addresses to find
previous transactions and addresses which are part of the coinmixer’s network.
The process of building the whole network is shown in figure 4.11.
Through the forward crawling only a single chain (red) of transactions will be
found. When the endpoint of this chain is reached, the crawler stops.
The user should specify the last transaction (tx 3) as the starting point for the
backward crawling process.
The green colored transactions can be found through this approach.
In this way nearly all transactions which belong to the mixer should be found.
However, transactions which were entirely spent before and do not have any
connection to an address in the crawled network cannot be found.
This should be a rare situation, since the output has to exactly match the
customer’s specified output amount.
Transactions which have not been connected to a change address chain yet also
cannot be found through this approach. But the chains will most probably become
connected through further mixing processes.
As we can see in figure 4.12, some transactions (blue) might not be found through
a forward/backward crawling process. However, they can be found through a
blockwise crawling process.
It should be noted that we have not yet made any assumptions about how the service
receives the mixing fee. Based on the described network, either some of the output
addresses are controlled by the coinmixer, or a change address chain leads to an
address which is manually controlled by the coinmixing service.
We did not examine these assumptions further.
table. While most of the transaction data is stored in the transaction_data
table, we stored transaction data that belongs to addresses in separate tables
(transaction_addresses, transaction_values). We chose this way of storing address
data since transactions vary in the number of input and output addresses, and for
most of our queries the specific address data is not important.
This way of storing the transaction data should improve access performance.
Besides the mapping, we created two different tables to store transaction data.
Every transaction that is processed by our script will be stored either
in transactions_size_normal or in transactions_size_big. Transactions which
exceed a length of 40,000 characters (JSON-formatted) will be stored in
transactions_size_big. We chose this size based on the average Bitcoin transaction
size and the overhead produced by the blockchain.info API. Most transactions
should be stored in transactions_size_normal, since transactions_size_big
uses the mediumtext type to store data.
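The size-based choice of table can be sketched as below. The function is hypothetical and only mirrors the rule stated in the text; the table names are used as plain strings and no database access is shown.

```python
# Sketch: choose the storage table from the JSON length of a transaction.
import json

BIG_THRESHOLD = 40_000  # characters of the JSON-formatted transaction


def target_table(tx):
    """Return the table name for a transaction given as a Python dict."""
    serialized = json.dumps(tx)
    if len(serialized) > BIG_THRESHOLD:
        return "transactions_size_big"
    return "transactions_size_normal"
```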
We chose to store every processed transaction, since otherwise the crawling
processes would need to request transactions which had already been processed.
Since the blockchain.info API blocks IP addresses after too many requests, our aim
is to send as few API requests as possible.
For performance reasons we divided received Bitcoin transactions into two size
categories.
Database structure
The column indexing has been chosen based on the implemented MySQL statements.
input transaction specified by the user. The crawler won’t analyze a transaction
twice.
After the starting point has been identified, the crawler checks whether all strong
indicators are met. Since the forward crawling process follows the change address,
all strong indicators have to be met. If this check fails, the transaction
specified by the user has most probably not been sent by the coinmixer. The
erroneous transaction is saved and the crawling process is aborted.
If the input transaction/last endpoint is a valid coinmixer transaction, it is stored
in the database and the main forward crawling process is able to begin.
Now the output addresses of the transaction are going to be checked. Based on the
indicators, it should be checked which of the addresses is the customer address and
which is the coinmixer’s change address.
The first indicator, which the crawler checks, is the number of sent transactions.
As we already described, a coinmixer’s change address should only send one
transaction.
If multiple transactions have been sent from an address, this is a strong
indicator that it is a customer’s address. It can also be an indicator for a
customer’s address if the transaction output being checked is still
unspent, because most of the coinmixer’s addresses will be used
in further mixing processes. However, this is only a weak indicator, since the
last address of the coinmixer network will always hold an unspent output as well.
While the strong indicators have to be fulfilled for a transaction to be a
coinmixer transaction, the indicators we mainly introduced in the Identifying
customer’s and coinmixer’s transactions subsection should not be relied on alone.
We implemented a counting system which helps us to distinguish between a
customer’s address and a coinmixer’s change address. We assigned a value to each
indicator, which is added to a counter if the indicator is met. After both
addresses have been checked, the crawler evaluates which address has the higher
counter. The address with the higher counter fulfills more indicators and is most
probably the change address.
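A counting system of this kind could be sketched as follows. The indicator set, the dictionary keys and especially the weights are assumptions for illustration; they are not the values used by the actual implementation.

```python
# Sketch of a counter-based decision between two output addresses.
# Keys and weights are hypothetical placeholders.

def change_score(addr, weights):
    """Sum the weights of all indicators that point to a change address."""
    score = 0
    if addr["sent_tx_count"] <= 1:   # change addresses send a single transaction
        score += weights["single_send"]
    if addr["output_spent"]:         # change outputs are usually spent again
        score += weights["spent"]
    if addr["uncommon_value"]:       # change amounts tend to have many decimals
        score += weights["uncommon_value"]
    return score


def pick_change_address(first, second, weights):
    """Return the likelier change address, or None if both counters tie."""
    score_first = change_score(first, weights)
    score_second = change_score(second, weights)
    if score_first == score_second:
        return None
    return first if score_first > score_second else second
```

Returning None on a tie leaves the decision to the caller, which matches the remark that equal counters need separate handling.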
As we can see in table 4.14, there might be a situation where both addresses result
in the same counter. However, this situation should occur very rarely and we did
not experience it in any testing scenario. We describe this situation in more
detail in Future Work. An uncommon value is a Bitcoin amount which is specified to
at least five decimals (e.g. 0.98263890 BTC).
We are going to describe the fee indicator in a little more detail. As we mentioned,
the fee of the coinmixer’s transactions is most probably static, but it can be
changed manually. Because it can change, the fee indicator is only in some cases
a good indicator for identifying a coinmixer’s transaction.
As we can see in figure 4.15, the fee of coinmixer transactions can most probably
be divided into static partitions.
Since we are able to start the crawling process with any coinmixer transaction,
the transaction may lie in different positions relative to these fee partitions.
In figure 4.16 we can see a transaction which is located after the newest fee
partition. This is the case if the processed transaction is the newest coinmixer
transaction the crawler has ever seen. While the fee indicator is still a reliable
indicator, it might be wrong, since a new fee partition could have been created.
The counter is increased by 3 as long as the fee of the transaction is set
correctly.
As we can see in figure 4.17, it is also possible that the processed transaction
is older than every transaction the crawler has ever seen. This situation
is handled in the same way as the situation described above. The counter is
increased by 3.
In figure 4.18 we see a transaction which is located inside of a partition. This
case is a good indicator for a coinmixer’s transaction, since the fee should not
change within a partition. The counter is increased by 4 (3+1) if the
transaction’s fee meets the partition’s fee.
As we can see in figure 4.19, the transaction may be located within a gap between
partitions. When the transaction is located between partitions, the fee is also a
good indicator, since we assume that there are no huge gaps between the partitions.
The counter is increased by 4 (3+1) if the transaction’s fee meets one of the
surrounding partitions’ fees.
In figure 4.20 we can see a partition which got extended, after a processed transaction
was located within a gap of partitions.
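The four partition cases can be summarized in a small scoring function. This is a sketch under assumptions: partitions are modeled as block height ranges with one fee each, they do not overlap, and a fee counts as correct if it lies within ±1 sat/Byte of the relevant partition fee. The class and function names are hypothetical.

```python
# Sketch of the fee indicator score for the four partition cases.
from dataclasses import dataclass


@dataclass
class FeePartition:
    first_height: int  # block height of the oldest known transaction
    last_height: int   # block height of the newest known transaction
    fee: int           # fee in sat/Byte used inside this partition


def fee_matches(fee, expected, tolerance=1):
    return abs(fee - expected) <= tolerance


def fee_score(height, fee, partitions):
    """Return 3 or 4 if the fee indicator is met, otherwise 0."""
    parts = sorted(partitions, key=lambda p: p.first_height)
    if not parts:
        return 0
    if height > parts[-1].last_height:    # newer than every known partition
        return 3 if fee_matches(fee, parts[-1].fee) else 0
    if height < parts[0].first_height:    # older than every known partition
        return 3 if fee_matches(fee, parts[0].fee) else 0
    for part in parts:                    # inside a known partition
        if part.first_height <= height <= part.last_height:
            return 4 if fee_matches(fee, part.fee) else 0
    for left, right in zip(parts, parts[1:]):  # in a gap between partitions
        if left.last_height < height < right.first_height:
            met = fee_matches(fee, left.fee) or fee_matches(fee, right.fee)
            return 4 if met else 0
    return 0
```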
We won’t get into more details of other indicators, since we already discussed them
before.
The results of the counting process are stored in the database in
coinmixer_analysis_log and coinmixer_analysis. While coinmixer_analysis_log
stores the results of each address analysis, coinmixer_analysis primarily logs
which transaction hashes have already been processed and should not be processed
again in further crawling processes.
After the crawler has determined which of the output addresses is the change
address, it follows the transaction sent from the change address and starts
analyzing it. In this way the crawler is able to follow the change addresses and
build the coinmixer’s network based on them.
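The forward crawling loop can be sketched as follows. The three callables are injected so that the example stays self-contained; their names, the transaction layout and the spent_by key are assumptions and not part of the original implementation.

```python
# Sketch of forward crawling along the chain of change addresses.

def forward_crawl(start_hash, get_tx, is_coinmixer_tx, pick_change_output):
    """Follow change outputs from start_hash and return the visited hashes."""
    chain = []
    seen = set()            # a transaction is never analyzed twice
    current = start_hash
    while current is not None and current not in seen:
        tx = get_tx(current)
        if tx is None or not is_coinmixer_tx(tx):
            break           # strong indicators not met: stop crawling
        seen.add(current)
        chain.append(current)
        change = pick_change_output(tx)
        # "spent_by" is the hash of the transaction spending the change
        # output, or None if the end of the chain has been reached.
        current = change.get("spent_by") if change else None
    return chain
```

In practice get_tx would read from the local transaction store first and only fall back to an API request for unknown hashes.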
In figure 4.21 the user provides tx1 as the coinmixer’s output transaction. Starting
from there, the crawler should be able to identify tx2, tx3 and other connected
previous transactions.
The backward crawling process is primarily based on the forward crawling process.
However, some specific adjustments had to be implemented. While the
forward crawling process checks whether an output address is controlled by the
coinmixer, all input addresses of the transaction have to be controlled by the
coinmixer. If the inputs of the transaction were not controlled by the coinmixer,
it would not be a coinmixer transaction. The main task in the backward crawling
process is therefore not to identify who controls the inputs, but rather to
identify who controls the inputs of the inputs.
In figure 4.22 the crawler checks whether an input address is a cash-in address or a
change address from a previous mixing process. If it’s a change address, this path
will be followed.
The user is able to specify a maximum crawling depth. This is the depth to which
a single chain of change addresses will be followed. It is recommended to choose a
moderate maximum depth (< 50), since one of the change address chains will most
probably lead to the first coinmixer transaction the crawler is able to find. After
a backward crawling process is interrupted or the maximum depth is reached, the
crawler finds the last transactions processed in each chain and continues the
crawling process from there.
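A depth-limited backward crawl could look like the sketch below. The classification of an input as a change address is passed in as a function, and the keys inputs and prev_tx are assumed names for this illustration.

```python
# Sketch of backward crawling with a maximum depth per chain.
from collections import deque


def backward_crawl(start_hash, get_tx, is_change_input, max_depth=20):
    """Collect transactions reachable backwards via change-address inputs."""
    found = []
    seen = {start_hash}
    queue = deque([(start_hash, 0)])
    while queue:
        tx_hash, depth = queue.popleft()
        tx = get_tx(tx_hash)
        if tx is None:
            continue
        found.append(tx_hash)
        if depth >= max_depth:
            continue  # a later run may resume from this transaction
        for tx_input in tx["inputs"]:
            previous = tx_input["prev_tx"]
            # Cash-in inputs come from customers and are not followed.
            if is_change_input(tx_input) and previous not in seen:
                seen.add(previous)
                queue.append((previous, depth + 1))
    return found
```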
While the backward crawling works similarly to the forward crawling, there is one
essential difference between both processes. In the forward crawling process only
two outputs have to be checked. Based on our assumptions, one of the outputs has
to be a coinmixer’s change address, while the other is a customer’s address.
However, there is no restriction on how many input addresses are cash-in addresses
and how many are change addresses. We are therefore not able to distinguish between
these types of addresses based on a counter.
The crawler categorizes transaction’s input addresses as coinmixer address only if
4.6 Deanonymization
As we described in Attacking Method, the deanonymization process does not
take the internal structure of the coinmixer into account. Our implementation
deanonymizes transactions based on the sent and received Bitcoin amounts.
In subsection Mixing fee we described how the input of the coinmixing service is
calculated. Since we identified the coinmixer.se network through the crawling
process, we are now able to map every possible coinmixer output to a given input
transaction, and vice versa.
A mapping provides the deanonymization of a given anonymized transaction. The
mapping is based on the equation mentioned in Mixing fee.
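The amount-based mapping can be sketched generically. Since the fee equation from the Mixing fee subsection is not repeated here, the sketch takes it as a parameter; the function names and the value key are assumptions.

```python
# Sketch of mapping an input amount to candidate coinmixer outputs.

def candidate_outputs(input_satoshi, outputs, expected_output, tolerance=0):
    """Return all crawled outputs whose value matches the expected payout.

    expected_output is a callable implementing the mixing fee equation;
    it maps the input amount in satoshi to the expected output amount.
    """
    target = expected_output(input_satoshi)
    return [out for out in outputs if abs(out["value"] - target) <= tolerance]
```

For a quick experiment a placeholder fee model can be passed in, for example lambda v: v - v // 50 - 50_000. That model is purely illustrative and is not the fee equation of the service.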
We implemented a deanonymization process as a Proof of Concept. Based on a given
4.7 Results
In the last subsections we described how we implemented the crawling
and deanonymization processes. Now we examine how well our implementation
works in practice. Our practical verification is based on the same transactions
mentioned in Characteristics of coinmixer’s output transactions. The first
transaction which we are going to analyze is
1e11f70d0db8c177a19ebdbc782e0b7bfddaef3e314f7b339f702bd76d76b76f. We sent this
transaction to coinmixer.se on 2017-09-26 18:36:02 UTC as an input transaction for
a mixing process. This transaction was confirmed by the Bitcoin network in the
main chain block at height 487072. We will later describe the settings which we
chose for this mixing process in more detail.
We chose the mentioned transaction as the starting point for the forward crawling
process.
It should be stated that the forward crawling process typically starts with a given
output transaction. However, since our input transaction was used without any
significant time delay in an output transaction sent by the coinmixer, we are also
able to use our input transaction as the starting point.
While we sent the tainted coins through our input transaction at 2017-09-26
18:36:02 UTC, the next transaction, which uses our tainted coins, was sent on
2017-09-26 19:40:38 UTC. Since the transaction at 2017-09-26 19:40:38 UTC is the
first coinmixer transaction which uses our tainted coins, this transaction should
be seen as the initial starting point of the crawled network.
To be able to crawl as many transactions as possible, we start with the forward
crawling process and apply the backward crawling process afterwards.
The last transaction which we received from coinmixer.se on an address controlled
by us was sent on 2017-09-28 01:37:46 UTC. An analysis of the crawled network up to
that point in time would be enough to deanonymize all of our input transactions.
However, we let the crawling process proceed to identify more of the whole
coinmixer.se network. We stopped the crawling process at a transaction which
was sent on 2017-10-11 18:08:41 UTC.
The forward crawling process was able to identify 486 transactions which were
sent in the time frame from 2017-09-26 19:40:38 UTC to 2017-10-11 18:08:41 UTC.
Based on our assumptions, all of these transactions belong to the coinmixer.se net-
work.
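The forward crawling step can be pictured as a breadth-first walk along
spending edges, restricted to a time window. The following is only a minimal
sketch under assumptions that are not taken from our implementation: the
in-memory dictionaries `spent_by` and `sent_at`, the predicate
`looks_like_mixer`, and the function name `forward_crawl` are hypothetical
stand-ins for the blockchain lookups and the coinmixer heuristics.

```python
from collections import deque


def forward_crawl(start_tx, spent_by, sent_at, looks_like_mixer, stop_time):
    """Collect transactions reachable from start_tx by following spends.

    spent_by: dict mapping a transaction id to the ids of transactions
              that spend one of its outputs (hypothetical lookup table).
    sent_at:  dict mapping a transaction id to its unix timestamp.
    looks_like_mixer: callable deciding whether a transaction matches the
              mixer heuristics (assumed, not the thesis code).
    stop_time: transactions sent after this timestamp are not followed.
    """
    found = []
    seen = {start_tx}
    queue = deque([start_tx])
    while queue:
        tx = queue.popleft()
        found.append(tx)
        for child in spent_by.get(tx, []):
            if child in seen:
                continue
            if sent_at[child] > stop_time or not looks_like_mixer(child):
                continue
            seen.add(child)
            queue.append(child)
    return found


# Toy graph: a -> b -> c, and b -> x, where x stands for a customer transaction.
spent_by = {"a": ["b"], "b": ["c", "x"], "c": []}
sent_at = {"a": 0, "b": 10, "c": 20, "x": 15}
crawled = forward_crawl("a", spent_by, sent_at, lambda tx: tx != "x", stop_time=100)
print(crawled)
```

In this toy graph the walk follows the chain a, b, c and skips x, because the
assumed predicate rejects it.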
Figure 4.23 shows 20 of the 486 crawled transactions. Since the rest of the
graph looks the same, we only provide this excerpt. Every node is meant to
correspond to a valid Bitcoin transaction. Blue nodes show change address
transactions sent by coinmixer.se, and red nodes show transactions sent by
customers.
For coinmixer transactions this holds. For customer transactions, however,
there can be cases where a customer received untainted Bitcoins but has not
spent them yet. We still created (red) nodes for these cases, since otherwise
the network structure would be harder to understand.
Generally speaking, blue nodes show transactions which have been made by the
coinmixer, while red nodes show transactions which were spent by customers or
can be spent by them in the future.
We were able to identify 486 transactions. However, according to the mixing
statistics published by coinmixer.se, more than 1000 coinmixer transactions
were sent in this time frame. To identify more transactions sent by the
coinmixer, we need to run a backward crawling process. As the starting point
of the backward crawling process we chose the endpoint of the forward crawling
process, with a crawling depth of 20. The endpoint transaction of the forward
crawling process was
2430b04b47520764d82422e696c1f39c85ca568f381729e427eea2b9c12c6190
(2017-10-11 18:08:41 UTC).
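A depth-limited backward crawl can be sketched as a level-by-level walk over
the inputs of a transaction. This is an illustration only: `funded_by`,
`backward_crawl`, and the toy transaction ids are hypothetical, and the sketch
omits the heuristics that decide whether a predecessor belongs to the
coinmixer.

```python
def backward_crawl(start_tx, funded_by, max_depth):
    """Return all transactions reachable within max_depth input hops.

    funded_by: dict mapping a transaction id to the ids of the transactions
               whose outputs it spends (hypothetical lookup table).
    """
    seen = {start_tx}
    frontier = [start_tx]
    for _ in range(max_depth):
        next_frontier = []
        for tx in frontier:
            for parent in funded_by.get(tx, []):
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return seen


# Toy chain: d spends c, c spends b, b spends a.
funded_by = {"d": ["c"], "c": ["b"], "b": ["a"]}
reached = backward_crawl("d", funded_by, max_depth=2)
print(sorted(reached))
```

With a depth of 2 the walk stops at b and never reaches a. The result is
bounded by the number of hops rather than by time, so a crawl of this kind can
return transactions from outside the period of interest.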
The earliest transaction which our backward crawling process was able to find
was sent on 2017-08-28 19:23:22 UTC, which is well before the time frame we
want to analyze. However, since the crawling process stores the timestamp of
every transaction, filtering can easily be accomplished through a MySQL query.
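Such a time-frame filter could look roughly like the sketch below. It is not
the query we ran: it uses Python's built-in `sqlite3` module as a stand-in for
MySQL so that it is self-contained, and the column name `transaction_time` as
well as the three sample rows are assumptions made for illustration.

```python
import calendar
import sqlite3


def utc(year, month, day, hour, minute, second):
    """Convert a UTC calendar time to a unix timestamp in seconds."""
    return calendar.timegm((year, month, day, hour, minute, second))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transaction_data (transaction_hash TEXT, transaction_time INTEGER)"
)
rows = [
    ("early", utc(2017, 8, 28, 19, 23, 22)),   # before the analyzed time frame
    ("inside", utc(2017, 9, 27, 12, 0, 0)),    # within the analyzed time frame
    ("late", utc(2017, 10, 12, 0, 0, 0)),      # after the analyzed time frame
]
conn.executemany("INSERT INTO transaction_data VALUES (?, ?)", rows)

# Time frame of the forward crawl as stated in the text.
window = (utc(2017, 9, 26, 19, 40, 38), utc(2017, 10, 11, 18, 8, 41))
in_window = [
    row[0]
    for row in conn.execute(
        "SELECT transaction_hash FROM transaction_data "
        "WHERE transaction_time BETWEEN ? AND ? ORDER BY transaction_time",
        window,
    )
]
print(in_window)
```

Only the row inside the time frame is returned; the other two are filtered out.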
Our backward crawling process was able to identify 3609 transactions. We cannot
state a specific time frame for these transactions, since it depends on the
depth of recursion.
To simplify our graph we removed all customer transactions. Figure 4.20 only shows
coinmixer’s change address transactions.
Based on forward and backward crawling we were able to create coinmixer.se’s net-
work. However, we still need to verify that the found network is the actual coin-
mixer.se network. We do this through the approach we described in Identifying
customer’s and coinmixer’s transactions.
We received 14 transactions from coinmixer.se. We were able to identify all of them
in the crawled network.
We now analyze how reliable the results of our implementation are by trying
to deanonymize transactions which we previously anonymized with the coinmixer.
We first focus on an anonymization process where we chose one forward address
and a time delay of 0 hours.
Since we sent the minimum possible amount of 0.001 BTC, this input could not be
divided into multiple output transactions.
Our tool was able to identify 13g89ys797GE9QtP1kj8Nv1gDfYFU1nps4 as our
output address. If the attacker knows that the mixing process took place
within 12 hours after the input transaction was sent, he is able to
deanonymize the transaction without any false positives; the deanonymization
succeeds.
If the attacker does not have any information about the chosen time delay, our
tool returns 11 results, 10 of which are false positives.
It should be noted that even if we had specified a time delay of up to 12
hours in the mixing process, the deanonymization results would still be the
same.
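The effect of the attacker's knowledge about the time delay can be illustrated
with a small filter over candidate outputs. The sketch below is a simplified
illustration and not our implementation: the function `filter_by_delay`, the
candidate tuples, and their timestamps are invented for the example.

```python
HOUR = 3600  # seconds


def filter_by_delay(candidates, input_time, max_delay_hours):
    """Keep candidates sent no later than max_delay_hours after the input.

    candidates: list of (address, unix_timestamp) pairs whose value already
                lies in the expected output range (toy data).
    """
    latest = input_time + max_delay_hours * HOUR
    return [c for c in candidates if input_time <= c[1] <= latest]


input_time = 0
candidates = [("addr_a", 1 * HOUR), ("addr_b", 30 * HOUR), ("addr_c", 100 * HOUR)]
narrow = filter_by_delay(candidates, input_time, 12)    # attacker knows the delay bound
wide = filter_by_delay(candidates, input_time, 120)     # attacker knows nothing
print(len(narrow), len(wide))
```

A tighter bound on the delay shrinks the candidate set, which is why the number
of false positives depends on what the attacker knows.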
As we can see, the deanonymization results vary with the amount of information
known by the attacker. Even in the worst case of a 120h delay, 10 false
positives seem to be a good result when filtering 2 million Bitcoin
transactions.
For most use cases, multiple forward addresses do not seem useful, since the
customer typically only wants to anonymize the coins and does not want to
split them across different addresses. Furthermore, using multiple addresses
that are managed through the same Bitcoin wallet does not enhance privacy
(see Privacy in Bitcoin). The forward addresses should never be combined again.
Nevertheless, we try to deanonymize a coinmixer.se participation for which we
specified three forward addresses. Our implementation has to check every
combination of transactions that might be the deanonymization of the
specified input transaction.
We tried to deanonymize a transaction with three forwards. We specified 0.001
BTC, 0.001 BTC and 0.00100001 BTC as the amounts of untainted coins we want
to receive. As time delays we chose values between 0 and 7 hours.
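Checking combinations of forwards amounts to searching for sets of outputs
whose values add up to the expected range. The sketch below uses
`itertools.combinations` for this, which is a simplification chosen for
brevity and not the nested-loop search of our implementation. The function
`find_combinations`, the address labels, and the value range are hypothetical;
values are given in satoshi (0.001 BTC = 100000 satoshi).

```python
from itertools import combinations


def find_combinations(outputs, value_min, value_max, forwards):
    """Return every set of `forwards` distinct outputs whose values sum into range.

    outputs: list of (address, value_in_satoshi) pairs (toy data).
    """
    matches = []
    for combo in combinations(outputs, forwards):
        total = sum(value for _, value in combo)
        if value_min <= total <= value_max:
            matches.append(tuple(address for address, _ in combo))
    return matches


outputs = [
    ("addr_1", 100000),
    ("addr_2", 100000),
    ("addr_3", 100001),
    ("addr_4", 250000),  # unrelated output of another customer
]
triples = find_combinations(outputs, 300000, 300001, forwards=3)
pairs = find_combinations(outputs, 300000, 300001, forwards=2)
print(triples, pairs)
```

The number of combinations grows quickly with the number of candidate outputs,
so restricting the candidates by time frame first keeps the search small.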
First we checked whether there is a single transaction which could be
identified as a false positive for our input transaction. No false positive
was found.
In the next step we tried to identify possible false positives for two
forwards. Every combination of two forwards is a false positive, since we
chose three forwards in the mixing process. Our implementation identified 33
different Bitcoin addresses as false positives for two forwards and a possible
delay of 120 hours. If the attacker knew that the mixing was done within 8
hours after the initial input transaction, no false positives would be found.
When our implementation checked all possible combinations of three forwards
with no knowledge about the time delay, it identified 24 different Bitcoin
addresses. This test was done with a maximum delay of 120 hours. If the
attacker knew that the mixing was done within 8 hours after the input
transaction, our implementation would identify 9 possible output
transactions.
It is important to note that in the last test scenario, six of the addresses
identified as false positives were under our control. They were output
addresses of other test cases we had run, and they were identified as false
positives because we used the same output amount of 0.001 BTC for multiple
test cases. Since these six addresses are artifacts of our testing procedure,
we ignore them in our results.
When we ignore them, we obtain the results shown in Table 4.27 for the
deanonymization of a mixing process using three forwards:
Table 4.27: Results for second testing case (up to 7h delay, 3 forwards)
If the attacker knows that the mixing process took place within 8 hours, he
is able to deanonymize the transaction by finding all three output addresses
without any false positive.
If he does not have any information about the chosen mixing options, the result
set for two forwards contains 24 false positives and the result set for three
forwards 15 false positives.
It should be mentioned that our results only count unique addresses;
multiple combinations of these addresses are possible.
Generally speaking, our implementation is able to identify coinmixer outputs which
are based on a similar input. Whether and how many false positives are found
depends primarily on the mixing behavior of other customers. If we are the
only customer who tries to anonymize a specific amount of Bitcoins, our tool
is able to identify the exact output transaction even with no knowledge about
the specified time delay. However, if multiple customers mix the same amount
of Bitcoins at the same time, the results contain multiple false positives.
In our testing scenario we were able to deanonymize our input transaction
without any false positives when the time frame of the mixing procedure could
be limited.
5 Conclusion
Through our implemented crawling methods we were able to filter 3609 out of more
than 2 million Bitcoin transactions. Most probably, all of these 3609
transactions belong to the coinmixer.se network in a specific time frame. Our
crawler accomplished this through a simple blockchain analysis. The identified
network could be verified through multiple transactions we received from
coinmixer.se. Based on our crawled network it can be assumed that, in our
testing time frame, most of coinmixer.se’s customers did not use the optional
setting of specifying multiple forward addresses.
Our implementation was able to deanonymize two transactions which we had
previously anonymized through coinmixer.se. When we set the maximum time delay
to the time delay specified in the mixing process, the deanonymization results
were correct and did not contain any false positives.
Based on our analyses we can conclude that the anonymization provided by
coinmixer.se depends heavily on the mixing behavior of other customers. While
our implementation was able to deanonymize our test cases, this does not
necessarily mean that every coinmixer.se transaction can be deanonymized
through our implementation. However, a customer cannot be sure whether the
coins he received can easily be deanonymized. Practical use cases that call
for a time delay of up to 120h and multiple forward addresses are limited.
Based on our analyses, the mixing process of coinmixer.se is not able to
provide reasonable privacy for practical use.
[41].
Furthermore, CoinJoin [30], Mixcoin [14], Coinparty [44], Xim [12] and Tumblebit
[26] have been introduced as mixing protocols.
The implementation of the Lightning Network may enhance privacy of the Bitcoin
network [38]. Analyses of centralized Bitcoin mixing services and
privacy-enhancing overlays have also been published. Malte Möser, Rainer Böhme
and Dominic Breuker analyzed Bitcoin Fog, BitLaundry, and the discontinued
mixing function of Blockchain.info as centralized mixing services through a
taint analysis [35]. Sarah Meiklejohn and Claudio Orlandi analyzed multiple
mixing algorithms and services such as CoinJoin [32].
[11] Lear Bahack. Theoretical bitcoin attacks with less than half of the computa-
tional power (draft). arXiv preprint arXiv:1312.7013, 2013.
[12] George Bissias, A Pinar Ozisik, Brian N Levine, and Marc Liberatore. Sybil-
resistant mixing for bitcoin. In Proceedings of the 13th Workshop on Privacy
in the Electronic Society, pages 149–158. ACM, 2014.
[13] Rainer Böhme, Nicolas Christin, Benjamin Edelman, and Tyler Moore. Bitcoin:
Economics, technology, and governance. The Journal of Economic Perspectives,
29(2):213–238, 2015.
[14] Joseph Bonneau, Arvind Narayanan, Andrew Miller, Jeremy Clark, Joshua A
Kroll, and Edward W Felten. Mixcoin: Anonymity for bitcoin with account-
able mixes. In International Conference on Financial Cryptography and Data
Security, pages 486–504. Springer, 2014.
[15] Grace Caffyn. What is the bitcoin block size debate and why does it
matter. URL: http://www.coindesk.com/what-is-the-Bitcoin-block-size-debate-and-why-does-it-matter/
(visited on 27/11/2015), 2015.
[16] Bitcoin Core. Segregated witness benefits. URL:
https://Bitcoincore.org/en/2016/01/26/segwit-benefits/ [Online], 2016.
[17] Who Has Custody. Optimizations, confirmation, contest and postlocking peri-
ods sidechain implementation using smartcontract in the secondary chain side
sidechain implementation using specific opcodes in the bitcoin side sidechain
implementation using turingcomplete scripting in the bitcoin side drivechain.
[18] David A. Harding and Peter Todd. Bip 0125: Opt-in full replace-by-fee
signaling, 2015.
[19] Christian Decker and Roger Wattenhofer. Bitcoin transaction malleability and
mtgox. In European Symposium on Research in Computer Security, pages 313–
326. Springer, 2014.
[24] Steven H Gifis. Dictionary of legal terms. Barron’s Educational Series, 2016.
[25] DA Harding. Bitcoin developer guide, 2015.
[26] Ethan Heilman, Leen Alshenibr, Foteini Baldimtsi, Alessandra Scafuro, and
Sharon Goldberg. Tumblebit: An untrusted bitcoin-compatible anonymous
payment hub. Cryptology ePrint Archive, Report 2016/575, Tech. Rep., 2016.
[29] Johnson Lau. Bip 0142: Address format for segregated witness, 2015.
[30] Greg Maxwell. Coinjoin: Bitcoin privacy for the real world. In Post on Bitcoin
Forum, 2013.
[31] Patrick McCorry, Siamak F Shahandashti, and Feng Hao. Refund attacks on
bitcoin’s payment protocol. In International Conference on Financial Cryptog-
raphy and Data Security, pages 581–599. Springer, 2016.
[32] Sarah Meiklejohn and Claudio Orlandi. Privacy-enhancing overlays in bit-
coin. In International Conference on Financial Cryptography and Data Security,
pages 127–141. Springer, 2015.
[33] Ian Miers, Christina Garman, Matthew Green, and Aviel D Rubin. Zerocoin:
Anonymous distributed e-cash from bitcoin. In Security and Privacy (SP), 2013
IEEE Symposium on, pages 397–411. IEEE, 2013.
[34] Malte Möser. Anonymity of bitcoin transactions. In Münster Bitcoin conference,
pages 17–18, 2013.
[35] Malte Möser, Rainer Böhme, and Dominic Breuker. An inquiry into money
laundering tools in the bitcoin ecosystem. In eCrime Researchers Summit
(eCRS), 2013, pages 1–14. IEEE, 2013.
[36] Malte Möser, Rainer Böhme, and Dominic Breuker. Towards risk scoring of
bitcoin transactions. In International Conference on Financial Cryptography
and Data Security, pages 16–32. Springer, 2014.
[37] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system, 2008.
[38] Joseph Poon and Thaddeus Dryja. The bitcoin lightning network. cit. on,
page 89, 2015.
[39] Dorit Ron and Adi Shamir. Quantitative analysis of the full bitcoin transac-
tion graph. In International Conference on Financial Cryptography and Data
Security, pages 6–24. Springer, 2013.
[40] Tim Ruffing, Pedro Moreno-Sanchez, and Aniket Kate. Coinshuffle: Practical
decentralized coin mixing for bitcoin. In European Symposium on Research in
Computer Security, pages 345–364. Springer, 2014.
[41] Tim Ruffing, Pedro Moreno-Sanchez, and Aniket Kate. P2p mixing and un-
linkable bitcoin transactions. IACR Cryptology ePrint Archive, 2016:824, 2016.
[42] Eli Ben Sasson, Alessandro Chiesa, Christina Garman, Matthew Green, Ian
Miers, Eran Tromer, and Madars Virza. Zerocash: Decentralized anonymous
payments from bitcoin. In Security and Privacy (SP), 2014 IEEE Symposium
on, pages 459–474. IEEE, 2014.
[43] Till Neudecker. Bitcoin cash (bch) sybil nodes on the bitcoin peer-to-peer
network, 2017.
[44] Jan Henrik Ziegeldorf, Fred Grossmann, Martin Henze, Nicolas Inden, and
Klaus Wehrle. Coinparty: Secure multi-party mixing of bitcoins. In Proceedings
of the 5th ACM Conference on Data and Application Security and Privacy,
pages 75–86. ACM, 2015.
A Database structure
--
-- Database: `Bachelorarbeit`
--
-- --------------------------------------------------------
--
-- Table structure for table `address_and_value_mapping`
--
-- --------------------------------------------------------
--
-- Table structure for table `coinmixer_analysis`
--
-- --------------------------------------------------------
--
-- Table structure for table `coinmixer_analysis_log`
--
-- --------------------------------------------------------
--
-- Table structure for table `coinmixer_graph`
--
-- --------------------------------------------------------
--
-- Table structure for table `errorlog`
--
-- --------------------------------------------------------
--
-- Table structure for table `fee_partition`
--
-- --------------------------------------------------------
--
-- Table structure for table `list_of_all_transaction_hashes`
--
--
-- Table structure for table `multiple_sequences`
--
-- --------------------------------------------------------
--
-- Table structure for table `transactions_size_big`
--
-- --------------------------------------------------------
--
-- Table structure for table `transactions_size_normal`
--
-- --------------------------------------------------------
--
-- Table structure for table `transaction_addresses`
--
-- --------------------------------------------------------
--
-- Table structure for table `transaction_data`
--
-- --------------------------------------------------------
--
-- Table structure for table `transaction_values`
--
--
-- Indexes for dumped tables
--
--
-- Indexes for table `address_and_value_mapping`
--
ALTER TABLE `address_and_value_mapping`
  ADD PRIMARY KEY (`ID`);
--
-- Indexes for table `coinmixer_analysis`
--
ALTER TABLE `coinmixer_analysis`
  ADD PRIMARY KEY (`analysis_id`),
  ADD KEY `isBM` (`is_cm`);
--
-- Indexes for table `coinmixer_analysis_log`
--
ALTER TABLE `coinmixer_analysis_log`
  ADD PRIMARY KEY (`ID`),
  ADD KEY `transactionHash` (`transaction_hash`),
  ADD KEY `previousHash` (`previous_hash`);
--
-- Indexes for table `coinmixer_graph`
--
ALTER TABLE `coinmixer_graph`
  ADD PRIMARY KEY (`AnalysisID`);
--
-- Indexes for table `errorlog`
--
ALTER TABLE `errorlog`
  ADD PRIMARY KEY (`ID`);
--
-- Indexes for table `fee_partition`
--
ALTER TABLE `fee_partition`
  ADD PRIMARY KEY (`partition_id`);
--
-- Indexes for table `list_of_all_transaction_hashes`
--
ALTER TABLE `list_of_all_transaction_hashes`
  ADD PRIMARY KEY (`transaction_hash`);
--
-- Indexes for table `multiple_sequences`
--
ALTER TABLE `multiple_sequences`
  ADD PRIMARY KEY (`transaction_id`);
--
-- Indexes for table `transactions_size_big`
--
ALTER TABLE `transactions_size_big`
  ADD PRIMARY KEY (`hash`),
  ADD KEY `fullblock` (`fullblock`),
  ADD KEY `blockheight` (`blockheight`,`fullblock`);
--
-- Indexes for table `transactions_size_normal`
--
ALTER TABLE `transactions_size_normal`
  ADD PRIMARY KEY (`hash`),
  ADD KEY `fullblock` (`fullblock`),
  ADD KEY `blockheight` (`blockheight`,`fullblock`);
--
-- Indexes for table `transaction_addresses`
--
ALTER TABLE `transaction_addresses`
  ADD PRIMARY KEY (`ID`),
  ADD UNIQUE KEY `inputAdress` (`transaction_address`);
--
-- Indexes for table `transaction_data`
--
ALTER TABLE `transaction_data`
  ADD PRIMARY KEY (`transaction_id`),
  ADD UNIQUE KEY `transactionHash` (`transaction_hash`),
  ADD KEY `isCM` (`is_cm`),
  ADD KEY `blockheight` (`blockheight`) USING BTREE;
--
-- Indexes for table `transaction_values`
--
ALTER TABLE `transaction_values`
  ADD PRIMARY KEY (`ID`),
  ADD KEY `values` (`transaction_value`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `address_and_value_mapping`
--
ALTER TABLE `address_and_value_mapping`
  MODIFY `ID` int(11) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=8429;
--
-- AUTO_INCREMENT for table `coinmixer_analysis`
--
ALTER TABLE `coinmixer_analysis`
  MODIFY `analysis_id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=12501;
--
-- AUTO_INCREMENT for table `coinmixer_analysis_log`
--
ALTER TABLE `coinmixer_analysis_log`
  MODIFY `ID` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=4196;
--
-- AUTO_INCREMENT for table `errorlog`
--
ALTER TABLE `errorlog`
  MODIFY `ID` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=43;
--
-- AUTO_INCREMENT for table `fee_partition`
--
ALTER TABLE `fee_partition`
  MODIFY `partition_id` tinyint(3) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=97;
--
-- AUTO_INCREMENT for table `transaction_addresses`
--
ALTER TABLE `transaction_addresses`
  MODIFY `ID` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=7539;
--
-- AUTO_INCREMENT for table `transaction_data`
--
ALTER TABLE `transaction_data`
  MODIFY `transaction_id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=4065;
--
-- AUTO_INCREMENT for table `transaction_values`
--
ALTER TABLE `transaction_values`
  MODIFY `ID` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=6339;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
B Python Code
import urllib
import json
import MySQLdb

DEBUG = False


# =====================
# Database
# =====================
class Database:
    """
    Creates Database-connections and forwards SQL-statements.
    """
    db = MySQLdb.connect(host="127.0.0.1", user="xxxx", passwd="xxxxx", db="xxxx")

    def __init__(self):
        raise Exception("Should not be initialized")

    @staticmethod
    def sql_execute(sql):
        """
        Executes SQL-Statement
        """
        cur = Database.db.cursor()
        cur.execute(sql)
        Database.db.commit()
        row = cur.fetchall()
        cur.close()
        return row


# =====================
# ErrorLog
# =====================
class ErrorLog:
    def __init__(self):
        raise Exception("Should not be initialized")

    @staticmethod
    def log(data, transaction_hash=None, address_hash=None):
        """
        Logs an occurred error in database.
        """
        if transaction_hash is None:
            transaction_hash = ""
        if address_hash is None:
            address_hash = ""
        # [...]


class ParticipationPossibility:
    """
    A possible Participation. It is created whenever a search-process found a possible
    participation-outcome.
    """

    def __init__(self, outputtransaction, outputaddress, output_range, forwards):
        self._outputtransaction = outputtransaction
        self._outputaddress = outputaddress
        self._output_range = output_range
        self._forwards_number = forwards

    def get_outputaddress(self):
        return self._outputaddress

    def get_forwards_number(self):
        return self._forwards_number


# =====================
# ParticipationPossibilityContainer
# =====================
class ParticipationPossibilityContainer:
    """
    This class holds a container with each possible participation-possibility.
    Furthermore it executes the searching-process
    """
    # [...]
        if self.transaction_time_minimum is False:
            raise Exception("Block could not be found.")

    def get_participations(self):
        return self.container

    # [...]
        if DEBUG:
            print("trying to find all possible mixings. forward_maxiumum: " + str(forwards_maximum) +
                  " maximum delay: " + str(max_delay))
        if DEBUG:
            print("transaction-time minumum: " + str(self.transaction_time_minimum) + " maximum: "
                  + str(self.transaction_time_minimum))
        if DEBUG:
            print ("found " + str(len(transaction_list)) + " transaction in timeframe")
        return self.container

    def get_block_timestamp(self):
        """
        This function returns the "first-seen"-timestamp on a block, which
        is specified by its blockheight. The Blockchain.info-API
        does not provide the block-headers for a given blockheight.
        To find the specific block, this function uses a timestamp of
        a transaction which should be inserted in a block on the same day.
        """
        transaction_timestamp = (self.inputtransaction_time+14400)*1000
        # second-based timestamp to millisecond-based timestamp
        url = "https://blockchain.info/de/blocks/" + str(transaction_timestamp) + "?format=json"
        response = urllib.urlopen(url)
        if DEBUG:
            print("loading list of blocks")
        try:  # todo: implement a way which also loads the next day (if transaction is send right before 0:00)
            data = json.loads(response.read())
            data_json = data
            block_time = 0  # default-value
            if DEBUG:
                print("trying to find block with blockheight " + str(self.blockheight_minimum))
            for block in data_json["blocks"]:
                if block["main_chain"] is False:  # we only search on blocks on the main-chain
                    continue
                if block["height"] == self.blockheight_minimum:
                    if DEBUG:
                        print("block " + str(self.blockheight_minimum) + " found. blocktime: " +
                              str(block["time"]))
                    block_time = block["time"]
        # [...]

    @staticmethod
    def find_possibilities_1_forward(inputtransaction_list, output_range):
        """
        :param inputtransaction_list:
        :param output_range:
        :return:
        """
        if DEBUG:
            print ("transaction-value-range: " + str(output_range))
        if output_range[0] == 0 or output_range[1] == 0:
            return False
        value_minimum = output_range[0]
        value_maximum = output_range[1]
        if DEBUG:
            print("checking if any transaction-value is in range: ")
        participation_list = []  # list with possible participation
        for transaction in inputtransaction_list:  # iterate through every transaction in list
            outputaddresses = transaction.get_outputaddress_list()  # get transaction-list
            for outputaddress in outputaddresses:
                # load data for every outputaddress
                outputaddress.mixer_results_load(transaction.get_hash())
                if not outputaddress.get_is_cm():  # only check addresses that are controlled by customer
                    # check if value in range
                    if value_minimum <= outputaddress.get_value() <= value_maximum:
                        participation_list.append(ParticipationPossibility(
                            [transaction], [outputaddress], output_range, 1)
                        )
        return participation_list

    @staticmethod
    def find_possibilities_2_forwards(inputtransaction_list, output_range):
        value_minimum = output_range[0]
        value_maximum = output_range[1]
        # [...]
                        # second iteration:
                        for transaction_2 in inputtransaction_list:
                            outputaddresses_2 = transaction_2.get_outputaddress_list()
        # [...]

    @staticmethod
    def find_possibilities_3_forwards(inputtransaction_list, output_range):
        value_minimum = output_range[0]
        value_maximum = output_range[1]
        participation_list = []  # list with possible participation
        for transaction in inputtransaction_list:
            outputaddresses = transaction.get_outputaddress_list()
            for outputaddress in outputaddresses:
                outputaddress.mixer_results_load(transaction.get_hash())
                if not outputaddress.get_is_cm():
                    value_first = outputaddress.get_value()
                    if value_first < value_maximum:
                        # second iteration:
                        for transaction_2 in inputtransaction_list:
                            outputaddresses_2 = transaction_2.get_outputaddress_list()
                            for outputaddress_2 in outputaddresses_2:
                                outputaddress_2.mixer_results_load(transaction_2.get_hash())
                                if not outputaddress_2.get_is_cm():
                                    value_second = outputaddress_2.get_value()
                                    if (value_first+value_second) < value_maximum:
                                        # third iteration
                                        for transaction_3 in inputtransaction_list:
                                            output_addresses_3 = transaction_3.get_outputaddress_list()
                                            for outputaddress_3 in output_addresses_3:
                                                outputaddress_3.mixer_results_load(transaction_3.get_hash())
                                                if not outputaddress_3.get_is_cm():
                                                    value_third = outputaddress_3.get_value()
                                                    value_full = value_first + value_second + value_third
        # [...]


# =====================
# Deanonymizer
# =====================
class Deanonymizer:
    """
    This class tries to deanonymize Coinmixer.SE-transactions by providing
    possible input/out-transactions that belong to the (un)tainted coins
    """

    def __init__(self):
        raise Exception("Should not be initialized")

    addressFee = 60000  # address-fee specified in Coinmixer.SE-FAQ
    feeRange = [1, 3]  # minimum and maximum service-Fee taken by Coinmixer.SE (specified in CM.SE-FAQ)

    @staticmethod
    def input_deanonymize(inputtransaction_hash, cm_address=None, forwards=3, max_delay=120):
        """
        This function takes the customers input-transaction to Coinmixer.SE (tainted coins) and maps it
        to output-transactions from Coinmixer.SE (untainted coins).
        It tries to find the untainted coins of a given customer-transaction.
        :param inputtransaction_hash: The transaction_hash of the transaction from the customer
            to Coinmixer.SE
        :param cm_address: The Bitcoin-Address of Coinmixer.SE which can be found in the
            specified transaction.
        :param forwards: The maximum amount of forwards that should be checked
        :param max_delay: The maximum delay that should be checked
        :return:
        """
        inputtransaction = Transaction(inputtransaction_hash)
        if DEBUG:
            print("input transaction loaded.")
        address_list = inputtransaction.get_outputaddress_list()  # load address_list of outputs
        if DEBUG:
            print("address_list loaded.")
        found = False
        output_range = []
        for address in address_list:
            if address.get_addresshash() == cm_address:
                found = True  # cm_address provided by user has been found in transaction
                sent_value = address.get_value()
                output_range = Deanonymizer.cm_out_ranges_calculate(sent_value)
        if found is False:
            return False  # provided address could not be found in transaction
        # [...]

    @staticmethod
    def cm_out_ranges_calculate(input_value):
        """
        Calculates the possible minimum and maximum value
        which could be sent by Coinmixer.SE to the customer
        The last four digits of the customers input-transactions are probably random.
        The smallest fee taken by Coinmixer.Se is 1%
        The highest fee taken by Coinmixer.Se is 3%
        Calculations:
        """
        output_range = []
        # [...]

    @staticmethod
    def output_deanonymize():
        """
        This function should be able to map the untainted coins to the customers
        input-transaction
        :return:
        """
        return False


# =====================
# Partition
# =====================
class Partition:
    """
    The fees of Coinmixer.SE-transactions are static for a individual timeframe.
    We call this timeframe "partition". This class checks, updates and handles these timeframes.
    Furthermore it shows if transactions are within a partition and shows if the transaction
    likely belongs to the Coinmixer.SE-network.
    All timestamps are unix-second-based.
    """
    # [...]

    def get_fee(self):
        return self._fee

    def get_time_start(self):
        return self._timeStart

    def get_time_end(self):
        return self._timeEnd

    def get_id(self):
        return self._id


# =====================
# PartitionContainer
# =====================
class PartitionContainer:
    """
    The PartitionContainer holds all partitions-objects.
    The Container is able to insert new partitions into the Mysql-Database, update existing
    partitions and provides the fee and its reliability (trustlevel) for a given timestamp
    (unix-seconds-based)
    """

    def __init__(self):
        raise Exception("Should not be initialized")

    @staticmethod
    def partition_insert(fee, timestamp_start, timestamp_end=None):
        """
        Creates new partitions and inserts it into Mysql-DB
        :param fee: The fee of the partition.
        :param timestamp_start: Startingpoint of the partition
        :param timestamp_end: Endingpoint of the partition
        :return:
        """
        if timestamp_end is None:
            timestamp_end = timestamp_start
        # [...]
        if DEBUG:
            print ("New partition inserted. fee: " + str(fee) + " timestamp_start: " +
                   str(timestamp_start) + " timestamp_end:" + str(timestamp_end))

    @staticmethod
    def partition_timerange_update(partition_id, timestamp_start=None, timestamp_end=None):
        """
        Updates an existing partition. Extends/Shortens the span of a partition
        :param partition_id: The ID of the partition which should get updated.
        :param timestamp_start: The new startingpoint. Startingpoint wont get updated if not provided.
        :param timestamp_end: The new endingpoint. Endingpoint wont get updated if not provided.
        :return:
        """
        if timestamp_start is None and timestamp_end is None:
            return False
        sql = ""
        if timestamp_start is None:
            sql = "UPDATE fee_partition SET " \
                  "timestamp_end = "+str(timestamp_end) + " WHERE partition_id = " + str(partition_id)
        if timestamp_end is None:
            sql = "UPDATE fee_partition SET " \
                  "timestamp_start = "+str(timestamp_start) + " WHERE partition_id=" + str(partition_id)
        if DEBUG:
            print ("Partition updated. partition_id: " + str(partition_id) + " timestamp_start: " +
                   str(timestamp_start) + " timestamp_end:" + str(timestamp_end))
        Database.sql_execute(sql)
        return True

@staticmethod
def partition_load():
"""
Loads partitions from database.
:return:
"""
@staticmethod
def get_fee_and_trustlevel(timestamp, return_partition=False):
"""
Returns the fee for a given timestamp, based on the fee partitions.
It also reports how reliable the result is (trustlevel). A higher trustlevel
indicates a more trustworthy fee result.
Trustlevels:
1 -> The timestamp lies in a gap between two partitions. Usually the fee of a
transaction that lies between two partitions equals the fee of one of these
partitions. However, this indicator is only usable if the gaps are not too
large. Whenever fees change abruptly, this indicator fails.
2 -> The timestamp lies within a partition. In most cases this means that the fee
of a transaction has to be the same (+/- variance) as the partition fee if the
transaction belongs to the Coinmixer.SE network. However, this indicator can be
wrong whenever the fees of Coinmixer.SE transactions change rapidly.
:param timestamp: The input timestamp, typically the timestamp of the transaction that
should be checked.
:param return_partition: If True, the whole partition(s) and the trustlevel are returned.
If False, only the fee(s) and the trustlevel are returned.
:return:
"""
if DEBUG:
print("Getting fees and trustlevels. timestamp: " + str(timestamp) +
" return_partition: " + str(return_partition))
list_of_earlier_partitions = []
list_of_newer_partitions = []
# divide the partitions into two groups: earlier (before timestamp) and later
# (after timestamp) partitions
if partition.is_timestamp_newer(timestamp):
list_of_earlier_partitions.append(partition)
if partition.is_timestamp_older(timestamp):
list_of_newer_partitions.append(partition)
# the partitions are divided into two groups:
# [earlier partitions] timestamp [later partitions]
# however, neither group is ordered
if not list_of_earlier_partitions and not list_of_newer_partitions:  # no partition in database
return None
# timestamp is in the gap between two partitions:
# [earlier partition] timestamp [later partition]
if return_partition is True:
return [[latest_possible_partition, earlist_possible_partition], 1]
return [[latest_possible_partition.get_fee(), earlist_possible_partition.get_fee()], 1]
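# Sketch (not part of the original script): the trustlevel lookup described in the
# docstring of get_fee_and_trustlevel, reduced to plain tuples. The Partition tuple and
# the function name fee_and_trustlevel are hypothetical stand-ins. Timestamps before the
# first or after the last partition return None here, which is a simplification; the
# listing does not show how the original treats that case.

```python
from collections import namedtuple

# Hypothetical stand-in for a fee partition: a time range that shares one fee.
Partition = namedtuple("Partition", ["start", "end", "fee"])


def fee_and_trustlevel(partitions, timestamp):
    """Return [fees, trustlevel] following the two trustlevels, or None."""
    if not partitions:
        return None  # no partition available
    for p in partitions:
        if p.start <= timestamp <= p.end:
            return [[p.fee], 2]  # timestamp lies inside a partition
    earlier = [p for p in partitions if p.end < timestamp]
    later = [p for p in partitions if p.start > timestamp]
    if earlier and later:
        # The groups are unordered, so pick the nearest neighbour on each side.
        latest_earlier = max(earlier, key=lambda p: p.end)
        earliest_later = min(later, key=lambda p: p.start)
        return [[latest_earlier.fee, earliest_later.fee], 1]  # gap between partitions
    return None  # outside all partitions: not covered by this sketch
```

# A caller would compare the fee of a transaction against the returned fee(s) and weigh
# the result by the trustlevel.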
# =====================
# Analyzer
# =====================
class Analyzer:
"""
"""
def __init__(self):
raise Exception("Should not be initialized")
ret = []
bla = 0
@staticmethod
def cm_check_transaction_fee_correct_partition_and_update(transaction, force=False):
"""
Checks whether the fee lies in the correct partition and updates the partition if
necessary (e.g. updates the last timestamp).
Updates automatically if cm_check_transaction_fee_correct_partition returns 1 or 2.
Updates are always applied if the transaction lies within a gap:
[earlier partition] transaction [later partition]
If force=True, updates are also applied if the transaction lies after or before all
partitions:
[earlier partition] transaction
transaction [later partition]
:param transaction: The transaction that should be checked.
:param force: If the transaction is newer/older than every partition, updates are only
applied if force is True.
:return:
"""
result = Analyzer.cm_check_transaction_fee_correct_partition(transaction)
if result is None:
PartitionContainer.partition_insert(transaction.get_fee_per_byte(), transaction.get_time())
return True
PartitionContainer.partition_load()
# returns [partition(s), trustlevel]
result = PartitionContainer.get_fee_and_trustlevel(transaction.get_time(), True)
if DEBUG:
print ("updating Partition is necessary. force=" + str(force))
return False
@staticmethod
def cm_check_transaction_all_inputs_cm():
"""
Should check whether all input addresses are owned by Coinmixer.SE
(not implemented yet).
:return:
"""
return False
@staticmethod
def cm_results_insert(outputadresses, transaction):
"""
Saves the check results (output addresses) to the database.
:param outputadresses: Output addresses that should be inserted into the database
:param transaction: Transaction object that belongs to the output addresses
:return:
"""
if not isinstance(outputadresses, list):  # outputadresses may be a single output address
outputadresses = [outputadresses]
sql_fee = output.get_is_cm_fee()
sql_common = output.get_is_cm_common_value()
sql_version_sequence_locktime = output.get_is_cm_version_sequence_locktime()
sql_is_cm = output.get_is_cm()
sql_is_spent = output.get_is_cm_spent()
sql_is_cashin = output.get_is_cm_cashin_address()
sql_next_transaction = output.get_next_transaction()
sql_history = output.get_is_cm_transaction_history()
if sql_fee is None:
sql_fee = "NULL"
if sql_common is None:
sql_common = "NULL"
if sql_version_sequence_locktime is None:
sql_version_sequence_locktime = "NULL"
if sql_is_cm is None:
sql_is_cm = "NULL"
if sql_is_spent is None:
sql_is_spent = "NULL"
if sql_is_cashin is None:
sql_is_cashin = "NULL"
if sql_next_transaction is None:
sql_next_transaction_hash = ""
else:
sql_next_transaction_hash = sql_next_transaction.get_hash()
if sql_history is None:
sql_history = "NULL"
@staticmethod
def cm_log_check_is_cashin(transaction_hash):
"""
Checks whether a transaction hash is already stored in the analysis results and
whether it belongs to a transaction made by a customer.
:param transaction_hash: Hash of the transaction that should be checked
:return:
"""
    @staticmethod
    def cm_log_check_is_errorhash(transaction_hash):
        """
        Checks whether an error occurred while processing this transaction in the past.
        If an error occurred, this may affect further processing (endless loop).
        :param transaction_hash: Hash of the transaction to look up in the error log.
        :return: True if the error log contains the hash, False otherwise.
        """
        sql = "SELECT count(ID) FROM errorlog WHERE transaction_hash = '" + transaction_hash + "'"
        res = Database.sql_execute(sql)[0][0]
        return res != 0
    @staticmethod
    def cm_log_check_is_first(previous_hash):
        """
        Checks whether previous_hash is the first cm-network transaction
        (prior transactions are customer input transactions).
        :param previous_hash: Hash of the transaction to check.
        :return: True -> previous_hash is the first cm-network transaction
                 False -> there are cm-network transactions prior to this transaction
        """
        sql = "SELECT count(is_first) FROM coinmixer_analysis_log WHERE previous_hash = '" + \
            previous_hash + "'"
        result = Database.sql_execute(sql)
        return result[0][0] != 0
    @staticmethod
    def cm_log_check(transaction_hash, forward=False):
        """
        Recursive function. With forward=True it follows all next-transaction hashes until
        no further next-transaction hash can be found (returns the last state of the
        previous forward-crawling process). With forward=False it follows all
        previous-transaction hashes until no further previous-transaction hash can be
        found (returns the last state of the previous backward-crawling process).
        :param transaction_hash: The transaction hash from which crawling should start
        :param forward: True -> forward crawling. False -> backward crawling
        :return: List of transaction hashes at which crawling can resume.
        """
        ret = []
        if forward:
            sql = "SELECT next_hash FROM coinmixer_analysis_log WHERE " \
                  "transaction_hash = '" + transaction_hash + "' and previous_hash = ''"
        else:
            sql = "SELECT previous_hash, is_first FROM coinmixer_analysis_log WHERE " \
                  "transaction_hash = '" + transaction_hash + "' and next_hash = ''"
        result = Database.sql_execute(sql)
        # check: the transaction has not been processed before, no error occurred,
        # and it is not a customer's transaction
        if not result and not Analyzer.cm_log_check_is_first(transaction_hash) and \
                not Analyzer.cm_log_check_is_errorhash(transaction_hash):
            return [transaction_hash]
        else:
            for res in list(set(result)):  # duplicate rows are removed
                ret += Analyzer.cm_log_check(res[0], forward)  # recursion
        return ret
    @staticmethod
    def cm_log_update_first(transaction_hash):
        """
        This function should only be used with backward crawling.
        If a transaction is the first transaction in the cm-network (the prior
        transaction is an input transaction by a customer), this function updates the
        coinmixer_analysis_log table for better performance (unnecessary checks of
        first transactions are skipped).
        :param transaction_hash: Hash of the first cm-network transaction.
        :return:
        """
        sql = "UPDATE coinmixer_analysis_log SET is_first = 1 WHERE previous_hash = '" + \
            transaction_hash + "'"
        Database.sql_execute(sql)
@staticmethod
def cm_log_insert(transaction_hash, next_hash=None, previous_hash=None, depth=0,):
"""
Inserts a new transaction into the Coinmixer log. Logs transactions that have been
analyzed so they do not have to be checked again.
:param transaction_hash: Hash of the transaction that has been analyzed
:param next_hash: Hash of the transaction that follows the analyzed transaction
(typically forward crawling)
:param previous_hash: Hash of the transaction that precedes the analyzed transaction
(typically backward crawling)
:param depth: The recursion depth (only applied to backward crawling)
:return:
"""
if next_hash is None and previous_hash is None:
return False
if next_hash is None:
next_hash = ""
if previous_hash is None:
previous_hash = ""
Database.sql_execute(sql)
@staticmethod
def cm_check_transaction_fee_correct_partition(transaction):
"""
Checks if fee is correct (based on partitions).
PartitionContainer.partition_load()
if result is None:
return None
    @staticmethod
    def cm_check_address_version_sequence_locktime(address):
        """
        Checks whether version, sequence and locktime of the next transaction sent by the
        address are correct. If version, sequence or locktime is wrong, or more than one
        transaction was sent through the address, it is likely not an address controlled
        by Coinmixer.SE.
        :param address: Address to check
        :return: True -> version, sequence, locktime correct (could be a Coinmixer.SE
                 transaction)
                 False -> version, sequence, locktime wrong (cannot be a Coinmixer.SE
                 transaction)
        """
        # check the next transaction
        return Analyzer.cm_check_transaction_version_sequence_locktime(
            address.get_transactions()[0])
@staticmethod
def cm_check_transaction_transaction_outputs(transaction, expected_outputs_int):
"""
Check if the number of Outputs of the address is expected.
Typically two outputs are expected (customer, and coinmixer.se-Network)
transactions = transaction.get_outputaddress_list()
if len(transactions) > expected_outputs_int:
return False
return True
    @staticmethod
    def cm_check_transaction_common_value_backward(transaction, cm_address):
        """
        Checks whether an address RECEIVED a common value (typically such addresses are
        owned by customers) or a non-common value (typically such addresses are owned by
        Coinmixer.SE) in the provided transaction.
        A common value is a value that is specified to at most five decimal places
        (e.g. 0.57312000). All values are given in satoshis.
        This function is used for backward crawling. It checks whether cm_address is
        likely a cash-in address (used by customers to cash in to Coinmixer.SE) or an
        address that is used for outputs (paying customers).
        :param transaction: Transaction that should be a Coinmixer.SE transaction.
        :param cm_address: An output address of the transaction that is probably owned by
            Coinmixer.SE
        :return: True if the other output received a common value or if there is no other
            output, False otherwise.
        """
        outputaddresses = transaction.get_outputaddress_list()
        for outputAddress in outputaddresses:
            # typically the second output address
            if outputAddress.get_addresshash() != cm_address.get_addresshash():
                # the last three decimal places (in satoshis) equal zero
                return outputAddress.get_value() % 1000 == 0
        return True  # default (e.g. only one output address)
    @staticmethod
    def cm_check_transaction_version_sequence_locktime(transaction):
        """
        Checks whether version, sequence and locktime are correct for Coinmixer.SE
        transactions.
        :param transaction: Transaction that should be checked.
        :return: True if version == 2, locktime > 0 and sequence == 4294967294 (0xFFFFFFFE).
        """
        return (transaction.get_version() == 2 and transaction.get_locktime() > 0
                and transaction.get_sequence() == 4294967294)
    @staticmethod
    def cm_check_address_common_value(address):
        """
        Checks whether an address SENT a common value (typically such addresses are owned
        by customers) or a non-common value (typically such addresses are owned by
        Coinmixer.SE).
        A common value is a value that is specified to at most five decimal places
        (e.g. 0.57312000). All values are given in satoshis.
        This function is used for forward crawling. It checks whether the address is
        likely a customer address.
        :param address: Address whose value should be checked.
        :return: True if the value is a common value, False otherwise.
        """
        return address.get_value() % 1000 == 0
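# Sketch (not part of the original script): why "% 1000 == 0" matches "at most five
# decimal places". One bitcoin has eight decimal places in satoshis, so a value whose last
# three satoshi digits are zero needs at most five decimal places in BTC. The helper names
# below are hypothetical; Decimal is used to avoid floating-point rounding.

```python
from decimal import Decimal

SATOSHI_PER_BTC = 100000000  # 1 BTC = 10^8 satoshis


def btc_to_satoshi(btc_string):
    """Convert a decimal BTC string to an integer number of satoshis."""
    return int(Decimal(btc_string) * SATOSHI_PER_BTC)


def is_common_value(value_satoshi):
    """True if the last three satoshi digits are zero (at most five BTC decimals)."""
    return value_satoshi % 1000 == 0
```

# Under these assumptions, 0.57312000 BTC (57312000 satoshis) is a common value, whereas
# 0.57312345 BTC (57312345 satoshis) is not.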
@staticmethod
def cm_check_address_transaction_count(address, inputtransaction):
"""
Checks whether the address sent a transaction before the input transaction.
Coinmixer addresses typically send only a single transaction.
:param address: The address to check.
:param inputtransaction: The known input transaction.
:return: 0 -> the input transaction is not the first transaction of the address, or
more than one transaction has been sent
-> address is MOST PROBABLY NOT controlled by Coinmixer.SE
2 -> the address has no unspent outputs and inputtransaction is the first and
only spent output
-> address is PROBABLY controlled by Coinmixer.SE
"""
# =====================
# Mapping
# =====================
class Mapping:
    """
    The database structure stores address hashes and values (spent outputs) in separate
    tables. This class provides a mapping between address hashes and values.
    Values and addresses are mapped through their indices (e.g. addr1 and val1 belong
    together).
    """
    def __init__(self):
        self._addressList = []  # list of addresses [addr1, addr2, ...]
        self._value_list = []  # list of values [val1, val2, ...]
        self._id = None
    def mapping_insert(self):
        """
        Inserts the mapping into the database.
        :return:
        """
        sql = "INSERT INTO address_and_value_mapping (address_list, value_list) VALUES " \
              "('" + json.dumps(self._addressList) + "', '" + json.dumps(self._value_list) + "')"
        Database.sql_execute(sql)
        sql = "SELECT LAST_INSERT_ID()"
        self._id = Database.sql_execute(sql)[0][0]
    def get_id(self):
        return self._id
# =====================
# Address
# =====================
class Address:
"""
Represents a Bitcoin-address with its general Bitcoin-address-information
and specific Coinmixer-information.
"""
API_URL = "https://blockchain.info/rawaddr/"
return inputtransaction
data = result[0]
self._is_cm_forward = bool(data[0])
self._is_cm_fee = data[1]
self._is_cm_common_value = data[2]
self._is_cm_version_sequence_locktime = data[3]
self._is_cm_connected = data[4]
self._is_cm_spent = data[5]
self._is_cm = data[6]
self._is_cm_cashin_address = data[7]
self._is_cm_transaction_history = data[8]
return True
def get_is_cm_counter(self):
"""
Returns the CM counter (see is_cm_counter_add(add) for more information).
:return:
"""
return self._is_cm_counter
def get_next_transaction(self):
"""
Returns the next transaction that was sent by Coinmixer.SE (forward crawling).
:return:
"""
return self._nextTransaction
def get_is_cm(self):
"""
Getter. For more information check set_is_cm(is_cm).
:return:
"""
return self._is_cm
def get_is_cm_fee(self):
"""
Getter. For more information check set_is_cm_fee(CMfee).
:return:
"""
return self._is_cm_fee
def get_is_cm_transaction_history(self):
"""
Getter. For more information check set_is_cm_transaction_history(CMhistory)
:return:
"""
return self._is_cm_transaction_history
def get_is_cm_version_sequence_locktime(self):
"""
Getter. For more information check set_is_cm_version_sequence_locktime(verSegLock).
:return:
"""
return self._is_cm_version_sequence_locktime
def get_is_cm_spent(self):
"""
Getter. For more information check set_is_cm_spent(CMspent).
:return:
"""
return self._is_cm_spent
def get_is_cm_common_value(self):
"""
Getter. For more information check set_is_cm_common_value(CMcommon).
:return:
"""
return self._is_cm_common_value
def get_transactions(self):
"""
Getter. For more information check transactions_load()
:return:
"""
return self._transactions
    def transactions_load(self):
        """
        Loads every transaction of the address from the Blockchain.info API
        (todo: check if all transactions for an address are
        already saved in database and load them from there)
        :return:
        """
        if DEBUG:
            print("loading all transactions")
        url = Address.API_URL + self._addresshash
        response = urllib.urlopen(url)
        try:
            data = json.loads(response.read())
            transaction_list = []
            for transactionData in data["txs"]:
                transaction_list.append(
                    Transaction(
                        transactionData["hash"], None, None, None, None, None,
                        None, None, None, None, None, None, transactionData))
            self._transactions += transaction_list
            if DEBUG:
                print("all transactions have been loaded.")
        except ValueError:
            print("JSON object could not be decoded. Probably your IP got blocked. "
                  "Try again later.")
            exit(1)  # non-zero exit status, since loading failed
    def first_transaction_timestamp(self):
        """
        Returns the timestamp of the first transaction that has been sent/received through
        this address.
        Transactions have to be loaded before calling this function (transactions_load)!
        :return: The earliest transaction timestamp.
        """
        return min(transaction.get_time() for transaction in self._transactions)
return sent_counter
def get_addresshash(self):
return self._addresshash
def get_sequence(self):
"""
List of all sequences used in transactions???
:return:
"""
return self._sequence
def get_address_id(self):
"""
Getter. For more information check set_address_id(address_id)
:return:
"""
return self._address_id
def get_value_id(self):
"""
Getter. For more information check set_value_id(value_id)
:return:
"""
return self._value_id
def get_value(self):
"""
Getter. For more information check set_value_id(value_id).
:return:
"""
return self._value
def is_inputaddress(self):
"""
True -> address is loaded as an input address of a transaction
False -> address is loaded as an output address of a transaction
:return:
"""
return self._is_inputaddress
def get_spent(self):
"""
Getter. Returns the value (satoshis) spent by this address
:return:
"""
return self._spent
def get_previous_transaction(self):
"""
Getter. For more information check set_previous_transaction(transaction)
:return:
"""
return self._previous_transaction
def get_is_cm_cashin_address(self):
"""
True -> address is a Coinmixer.SE address that customers use to access the
Coinmixer.SE service
False -> address is used by Coinmixer.SE to forward transactions
:return:
"""
return self._is_cm_cashin_address
# =====================
# Transaction
# =====================
class Transaction:
"""
Represents a Bitcoin-transaction with its general
Bitcoin-transaction-information and specific Coinmixer-information.
"""
API_URL = "https://blockchain.info/de/rawtx/"
def __init__(
self, tx_hash, tx_id=None, blockheight=None,
fee=None, size=None, time=None,
version=None, sequence=None,
locktime=None, inputaddress_list=None,
outputaddress_list=None, is_cm=None, data=None):
self._hash = tx_hash
self._blockheight = blockheight
self._fee = fee
self._size = size
self._time = time
self._version = version
self._sequence = sequence
self._locktime = locktime
self._inputaddress_list = inputaddress_list
self._outputaddress_list = outputaddress_list
self._is_cm = is_cm
self._id = tx_id
self._forward = True
self._block_full = False # not used at the moment
if DEBUG:
print("new transaction-object has been created: " + self._hash)
# todo: after load_from_database() check if data could be loaded from the
#       normalsized/bigsized-transaction-table
# todo: (change data structure of normalsized/bigsized-transaction-table beforehand
#       if necessary)
# todo: maybe the load_from_api(data) function can be used (data from
#       normalsized/bigsized-table)
# if an object is created and only the transaction hash is passed,
# the transaction will be loaded from the database or the blockchain.info API
if all(parameter is None for parameter in [
blockheight, fee, size, time,
version, sequence, locktime,
inputaddress_list, outputaddress_list,
is_cm
]):
def load_from_database(self):
"""
Loads transaction-data from database (transaction_data-table).
:return:
"""
if DEBUG:
print ("loading transaction-data from database")
if not result:  # transaction hash has never been seen before by this script
if DEBUG:
print("couldn't load transaction-data. Not in database.")
return False
if result[0][2] == 1:  # transaction has already been processed and saved to the database
if self._sequence is None:
sql = "SELECT sequences FROM multiple_sequences WHERE transaction_id = " + str(self._id)
self._sequence = json.loads(Database.sql_execute(sql)[0][0])
address_id_list = json.loads(result[0])
value_id_list = json.loads(result[1])
if DEBUG:
print("failed to load transaction-data.")
return False
if data is None:
url = Transaction.API_URL + self._hash
response = urllib.urlopen(url)
try:
data = json.loads(response.read())
except ValueError:
print("JSON object could not be decoded. Probably your IP got blocked. Try again later.")
exit(0)
try:
if DEBUG:
print("successfully loaded transaction-data from API.")
self.save_to_database(data)  # save data to database
transaction_full_json_encoded = json.dumps(transaction_full_json)
tablename = "transactions_size_normal"
bigsized = False
def check_previous_addresses(self):
"""
Checks whether the previous addresses (backward crawling) are cash-in addresses
(Coinmixer addresses that customers use to cash in) or
Coinmixer addresses that are used to cash out customers.
This function should only be called once the transaction is confirmed as a Coinmixer
transaction.
It is mainly used to determine which address should be checked next
(backward crawling).
:return:
"""
if DEBUG:
print ("Addreses to check: " + str(self._inputaddress_list))
for inputAddress in self._inputaddress_list:
if DEBUG:
print("checking address: " + inputAddress.get_addresshash())
inputAddress.transactions_load()
# since it is a CM transaction, each input has to be controlled by Coinmixer.SE
inputAddress.set_is_cm(True)
# since it is an input, it has already been spent (in this transaction)
inputAddress.set_is_cm_spent(True)
inputAddress.set_is_cm_cashin_address(True)
continue
if DEBUG:
print("inputtransaction found: " + inputtransaction.get_hash())
print("checking version, sequence, locktime of inputtransaction:")
inputAddress.set_previous_transaction(inputtransaction)
version_sequence_locktime_bool = \
Analyzer.cm_check_transaction_version_sequence_locktime(inputtransaction)
if DEBUG:
print("result: " + str(version_sequence_locktime_bool))
inputAddress.set_is_cm_version_sequence_locktime(version_sequence_locktime_bool)
if DEBUG:
print("checking number of transactionsoutputs equals 2")
transaction_counter = Analyzer.cm_check_transaction_transaction_outputs(
inputtransaction, 2)
if DEBUG:
print("result: " + str(transaction_counter))
if DEBUG:
print("checking common-value:")
common_value_bool = Analyzer.cm_check_transaction_common_value_backward(
inputtransaction, inputAddress)
if DEBUG:
print("result:" + str(common_value_bool))
if common_value_bool is True:
# CMcounter +=1
inputAddress.set_is_cm_common_value(True)
else:
inputAddress.set_is_cm_common_value(False)
if DEBUG:
print ("checking fee:")
fee_check = Analyzer.cm_check_transaction_fee_correct_partition(inputtransaction)
if DEBUG:
print("Fee-check: " + str(fee_check))
if DEBUG:
print("full result: its a address used for customer-cashins")
if inputAddress.get_is_cm_common_value() is True or inputAddress.get_is_cm_fee() is True:
if DEBUG:
print("full result: its probably NOT an address used for customer-cashins")
# if the common-value check or the fee check is True, it is probably not a cash-in address
inputAddress.set_is_cm_cashin_address(False)
else:
if DEBUG:
print("full result: its probably an address used for customer-cashins")
inputAddress.set_is_cm_cashin_address(True)
def check_next_addresses(self):
"""
Checks whether the next address is probably owned by Coinmixer.SE or by a customer.
The differentiation between Coinmixer address and customer address is based on a
counter.
Counter rules:
There should only be two addresses in the output list.
The address with the highest counter is most probably controlled by Coinmixer.SE.
The address with the lower counter is most probably controlled by the customer.
More than 1 transaction sent from address -> counter = -1 (no further counting applied)
Version, locktime, sequence not correct -> counter = -1 (no further counting applied)
Unspent outputs available -> counter += 0
All outputs spent -> counter += 1
Received an uncommon value -> counter += 2
Next transaction uses correct fee -> counter += 3
Next transaction uses correct fee and is in partition (trustlevel = 2) -> counter += 1
:return:
"""
if DEBUG:
print ("Addreses to check: " + str(self._outputaddress_list))
for output in self._outputaddress_list:
if DEBUG:
print("Address to check: " + output.get_addresshash())
output.transactions_load()
if DEBUG:
print("Analyzing address:\n Analyzing transaction-count")
# transaction_count_result = 0 -> not a Coinmixer address (multiple spends) (strong indicator)
# transaction_count_result = 1 -> probably not a Coinmixer address (unspent) (low indicator)
# transaction_count_result = 2 -> could be a Coinmixer address (spent) (low indicator)
transaction_count_result = Analyzer.cm_check_address_transaction_count(output, self)
if DEBUG:
print("transaction-count result: " + str(transaction_count_result))
print("checking common value:")
# True -> common value sent (low indicator)
# False -> uncommon value sent (low indicator)
common_value_bool = Analyzer.cm_check_address_common_value(output)
if DEBUG:
print("common-value result: " + str(common_value_bool))
print("checking version, sequence, locktime")
# True -> version, sequence, locktime ok (low indicator)
# False -> version, sequence, locktime not ok (strong indicator)
version_sequence_locktime_bool = Analyzer.cm_check_address_version_sequence_locktime(output)
if DEBUG:
print("version, sequence, locktime result: " + str(version_sequence_locktime_bool))
fee_check = None
output.set_is_cm(None)
# set next-transaction of address and apply fee-check if possible
if transaction_count_result == 2:
sent_count_result = output.sent_counts(True)
if sent_count_result[0] == 1:
nexttransaction = sent_count_result[1][0]
# fee check of the next transaction
fee_check = Analyzer.cm_check_transaction_fee_correct_partition(nexttransaction)
output.set_next_transaction(nexttransaction)
output.set_next_transaction(None)
output.set_is_cm(None) # default-value
if DEBUG:
print("results transaction-check:")
if DEBUG:
print("Analysis of transaction-count failed ")
print ("Counter: " + str(output.get_is_cm_counter()))
continue
elif transaction_count_result == 2: # no unspent output available
output.set_is_cm_spent(True)
output.is_cm_counter_add(1)
output.set_is_cm_transaction_history(2)
if DEBUG:
print("spent: True ")
print ("Counter: " + str(output.get_is_cm_counter()))
if DEBUG:
print("spent: False ")
print ("Counter: " + str(output.get_is_cm_counter()))
continue
if version_sequence_locktime_bool is False:  # version, sequence, locktime wrong (for a Coinmixer.SE address)
output.set_is_cm(False)
output.set_is_cm_version_sequence_locktime(False)
output.set_is_cm_counter(-1)
if DEBUG:
print("Analysis of version, sequence, locktime failed. aborting. ")
print ("Counter: " + str(output.get_is_cm_counter()))
continue
elif version_sequence_locktime_bool is True:  # version, sequence, locktime correct (for a Coinmixer.SE address)
if DEBUG:
print("Analysis of version, sequence, locktime ok. ")
print ("Counter: " + str(output.get_is_cm_counter()))
output.set_is_cm_version_sequence_locktime(True)
if DEBUG:
print("common-value: False ")
print ("Counter: " + str(output.get_is_cm_counter()))
if DEBUG:
print("common-value: True")
print ("Counter: " + str(output.get_is_cm_counter()))
if DEBUG:
print("fee-check: OK (prob. last transaction) ")
print ("Counter: " + str(output.get_is_cm_counter()))
if DEBUG:
print("fee-check: good (in partition or gap)")
print ("Counter: " + str(output.get_is_cm_counter()))
if DEBUG:
print("fee-check-bad ")
print ("Counter: " + str(output.get_is_cm_counter()))
if DEBUG:
print("checking done. ")
print ("Counter: " + str(output.get_is_cm_counter()))
:param forward:
:param depth:
:return:
"""
if DEBUG:
print ("crawling. forward: " + str(forward) + " depth: " + str(depth))
if depth == 0 and forward is False:  # recursion is only executed until depth == 0
if DEBUG:
print("maximum depth reached. stop crawling this path")
return True
if DEBUG:
print("loading and saving address-data(value/hash-mapping/sequences). Transaction: " +
self.get_hash())
# create mapping for addresses and values
inputaddress_mapping = Mapping()
outputaddress_mapping = Mapping()
inputaddress_mapping.mapping_insert()
outputaddress_mapping.mapping_insert()
tmp_sequence = self._sequence
if type(self._sequence) == list:
tmp_sequence = "NULL"
Database.sql_execute(sql)
sql = "SELECT LAST_INSERT_ID()"
self._id = Database.sql_execute(sql)[0][0]
# updates the list that holds every transaction that has been seen by the crawler
sql = "UPDATE list_of_all_transaction_hashes SET in_transaction_data = 1 " \
"WHERE transaction_hash = '" + self._hash + "'"
Database.sql_execute(sql)
if DEBUG:
print("data saved in database.")
# Backwards-Crawling
if self._forward is False:
if DEBUG:
print("backward-crawling starts.")
checkresult = self.check_previous_addresses()  # checks previous addresses:
# Every previous address should be controlled by Coinmixer.SE.
# However, some of these addresses are cash-in addresses,
# which customers use to cash in bitcoins that are going to be "anonymized",
# and others are addresses that Coinmixer.SE uses to cash out "anonymized" coins
# to customers.
if checkresult is False:  # at least one input address seems not to be controlled by Coinmixer.SE
ErrorLog.log(
"Error occurred on transaction: " +
self.get_hash() +
" (transaction classified as Coinmixer.SE-transaction but at least one"
" input address seems not to be owned by Coinmixer.SE)", self.get_hash()
)
return False
Analyzer.cm_check_transaction_fee_correct_partition_and_update(
input_address.get_previous_transaction(), True
)
prev_transaction = input_address.get_previous_transaction()
list_of_previous_transactions.append(prev_transaction)
if input_address.get_is_cm_fee() is None:
PartitionContainer.partition_insert(
prev_transaction.get_fee_per_byte(), prev_transaction.get_time()
)
found = False
for inputaddress in self._inputaddress_list:
if inputaddress.get_addresshash() == prev_out.get_addresshash():
found = True
if found is False:
prev_out.set_is_cm(False)
previous_transaction.set_forward(False)
Analyzer.cm_results_insert(prev_out, previous_transaction)
# recursion
for previous_transaction in list_of_previous_transactions:
if DEBUG:
print("recursion. depth: " + str((depth-1)))
previous_transaction.set_is_cm(True)
previous_transaction.insert_into_cm_network(False, depth - 1)
if DEBUG:
print ("maximum-depth reached or path has been fully analyzed. Try another path")
# forward-crawling
if self._forward is True:
# checks whether the next address is a Coinmixer address or a customer address
self.check_next_addresses()
next_address = None
non_cm_addresses = []
for output in self._outputaddress_list:
if next_address is None:
next_address = output
continue
if output.get_is_cm_counter() > next_address.get_is_cm_counter():
# the address with the highest counter is probably the next cm-address
non_cm_addresses.append(next_address)
next_address = output
if DEBUG:
print ("Next Coinmixer-address: " + str(next_address))
print ("Next non-Coinmixer-address: " + str(non_cm_addresses))
if DEBUG:
print("Next transaction: " + next_transaction.get_hash())
next_transaction.set_is_cm(True)
next_address.set_is_cm(True)
Analyzer.cm_check_transaction_fee_correct_partition_and_update(next_transaction, True)
def get_hash(self):
return self._hash
def get_id(self):
return self._id
def get_blockheight(self):
return self._blockheight
def get_fee(self):
return self._fee
def get_fee_per_byte(self):
return int(self._fee/self._size)
def get_time(self):
return self._time
def get_version(self):
return self._version
def get_sequence(self):
return self._sequence
def get_locktime(self):
return self._locktime
def get_inputaddress_list(self):
return self._inputaddress_list
def get_outputaddress_list(self):
return self._outputaddress_list
def get_is_cm(self):
return self._is_cm
def get_size(self):
return self._size
def get_forward(self):
return self._forward
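# Sketch (not part of the original script): the counter rules from the docstring of
# check_next_addresses, written as one pure function. The function name cm_counter and its
# boolean parameters are hypothetical; the sketch assumes the rules are applied exactly as
# listed there and ignores the early aborts of the real implementation.

```python
def cm_counter(sent_count, version_sequence_locktime_ok, all_outputs_spent,
               received_uncommon_value, next_fee_ok, next_fee_in_partition):
    """Score an output address; -1 means the address is ruled out."""
    if sent_count > 1 or not version_sequence_locktime_ok:
        return -1  # strong indicators: no further counting applied
    counter = 0
    if all_outputs_spent:
        counter += 1
    if received_uncommon_value:
        counter += 2
    if next_fee_ok:
        counter += 3
        if next_fee_in_partition:  # corresponds to trustlevel == 2
            counter += 1
    return counter


def pick_cm_address(counters):
    """Return the key with the highest counter from a {address: counter} dict."""
    return max(counters, key=counters.get)
```

# With two outputs, the one with the higher counter would be followed as the probable
# Coinmixer.SE address, and the other treated as the probable customer address.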
# =====================
# Network
# =====================
class Network:
"""
Initiates the crawling processes. Checks whether the first transaction provided is a
Coinmixer transaction, restores the last crawling processes and prints out network
graphs (MATLAB commands).
"""
def __init__(self):
raise Exception("Should not be initialized")
@staticmethod
def load():
"""
Not implemented yet.
:return:
"""
return False
    @staticmethod
    def transaction_crawling(inputtransaction_hash, forward=True, depth=5):
        """
        Starts the crawling process. Default: forward crawling.
        The depth parameter is only used for backward crawling.
        The transaction provided by the user has to be a
        Coinmixer.SE transaction (version, sequence, locktime and fee are checked).
        Warning: The fee partition will be forcefully created!
        Further crawling processes may generate wrong results if a non-Coinmixer.SE
        transaction is used as input transaction.
        :param inputtransaction_hash: Hash of the transaction at which crawling starts.
        :param forward: True -> forward crawling. False -> backward crawling.
        :param depth: Maximum recursion depth (backward crawling only).
        :return:
        """
        print("Start crawling: hash: " + str(inputtransaction_hash) + " forward: " +
              str(forward) + " depth: " + str(depth))
        if DEBUG:
            print("loading old results:")
        # checks if the hash has already been processed
        res = Analyzer.cm_log_check(inputtransaction_hash, forward)
        if not res:
            print("No transaction-hash found that could be used for crawling! "
                  "(last hash produced error or is a cashin-transaction)")
        else:
            # transactions provided by the user have to be Coinmixer.SE transactions
            for inputtransaction_hash in res:
                if DEBUG:
                    print("Analyzing transaction: " + inputtransaction_hash)
                inputtransaction = Transaction(inputtransaction_hash)
                if DEBUG:
                    print("Analyzing version, sequence, locktime of transaction: ")
                if Analyzer.cm_check_transaction_version_sequence_locktime(
                        inputtransaction) is False:
                    if DEBUG:
                        print("Analysis failed. Logging error.")
                    ErrorLog.log(
                        "Transaction provided by user most probably not a "
                        "coinmixer.SE-transaction. Hash: "
                        + inputtransaction_hash + " (Sequence, Locktime, Version wrong)",
                        inputtransaction_hash)
                    continue
                if DEBUG:
                    print("version, sequence, locktime ok.\nchecking if fee is ok:")
                result = Analyzer.cm_check_transaction_fee_correct_partition_and_update(
                    inputtransaction, True)  # force
                if DEBUG:
                    print("fee result: " + str(result))
                if result is True:
                    inputtransaction.set_is_cm(True)
                    inputtransaction.insert_into_cm_network(forward, depth)  # start crawling
                else:
                    ErrorLog.log("Wrong fee in transaction provided by user : "
                                 + inputtransaction_hash +
                                 " (fee inconsistent with fee-partitions)",
                                 inputtransaction_hash)
@staticmethod
def blockwise_crawling():
"""
Not implemented yet.
:return:
"""
return False
@staticmethod
def graph_show():
"""
Not implemented yet.
:return:
"""
return False