Bitcoin Source Code Walkthrough

This text describes software source code written in the C++ programming language. It is advanced and not easy to understand. If you’ve come here by mistake or want to learn more about my teaching of programming, you need the README document in the Student/Teaching Materials section. If instead you are looking to start investing in bitcoin, you need the Cryptocurrency investment system document. Also consider joining one of my crypto hedge fund and/or programming student groups on Telegram to get faster updates on what we’re doing in our crypto-space.

Introduction

This is a work in progress. It is changing rapidly. Come back often to see what new stuff I’ve added.

When Satoshi Nakamoto published his paper on bitcoin and then released the code in 2009, I went nuts. The code base was small (about 9000 lines) and fairly straightforward. I dived into it and fell totally in love. Over the years many students have asked me to explain how various parts of the code work, and this has become almost a full-time pursuit of mine. Teaching others has always been a big part of my life. If you’ve got this far through my teaching courses you are ready to tackle this ultimate bit now. The goal of this project/book is to have prospective blockchain developers gain a deep understanding of bitcoin at the source code level. The bitcoin version we are going to study, version 0.1.5, was originally implemented by Satoshi Nakamoto. Why this version and not a more modern one (the current version at the time of this writing is 0.17.1)? It has only about 9000 lines of source files and 6000 lines of header files. Most of the code is still relevant in bitcoin’s codebase today (2018), which makes this code reading project easier to handle and almost an abridged version of the main code base in use today. Because of its small size and relative lack of sophistication, it lends itself to teaching and puts trained developers ‘out there’ that much sooner, with the ability to jump into the current code base with only a little more effort.

Prerequisites

You need to know some programming to get the most out of this reading. If you have no programming language exposure at all you’re going to have a very tough time understanding any of this document so jumping in here with the intention of learning the internals of bitcoin is not a good idea. So what is the best way to get through this? Go through my programming course in its entirety. That means (if you’re following my suggested path to a deep knowledge of programming), you start with Zed Shaw’s Learn Python The Hard Way and work your way through Harvard’s CS50 course and then branch here into more C/C++ stuff.

Here is a list of C/C++ reference books that you should keep with you and refer to as you go through the bitcoin source and find stuff you don’t understand. Of course, throughout your IT career/hobby, GIYF – Google Is Your Friend. Try Google first, then dive into the books.

As usual, and I can’t emphasize this enough, if you want to become a programmer start at the beginning of these books, set up your programming environment and key in every single program you come across in these books and on the web in your queries, searches and general research. That is THE ONLY WAY to learn how to program. If you do that and get through these three books here you’ll be one helluva programmer. If you do that … get in touch with me ASAP … there’s almost certainly something very lucrative we can achieve together.
[pdf. 1Mb] Kernighan, Ritchie – The C Programming Language, 2nd edition

References

Below are my reference sources. These are key to understanding what bitcoin is all about. Peppered throughout this text are references into these texts/videos/websites as well as links to relevant portions. Don’t skip these instructions if you want to get a full understanding of bitcoin. Follow the links and read, read, read. Put the reading away and come back to it later. Keep doing this until you understand fully and have digested ALL the ideas. Then come back to the code. Remember that the code is the final arbiter, the final ‘source’ (pun intended), the ‘one ring that rules them all.’

These references are valuable, though you do not need to go through them in their entirety before reading the source. Jump into the source and follow along. Take your time. Go back and forth between the source code and these references. Look at the diagrams I have generated via Doxygen. Generate them yourself if you like. As I’ve done that work for you, you could leave that task for later, once you’re a little more familiar with the code base and want a reference tool on your own hard disk that you generated yourself and fully control.

  1. Original Satoshi Nakamoto bitcoin paper here >> https://bitcoin.org/bitcoin.pdf
    1. Harrison Kinsley (YouTube handle sentdex) has done a sterling job dissecting this paper line by line. Have a look at his playlist here and work through all the videos in that playlist with a printed copy of the paper by your side.
  2. Andreas Antonopoulos’s Mastering bitcoin book here >> https://github.com/bitcoinbook/bitcoinbook/
    1. Elliptic curves are the ‘bread and butter’ of bitcoin. Andrea Corbellini can help you understand them much more deeply than Andreas’s work. Here is Corbellini’s series of blog posts aimed at giving you a gentle introduction to the world of elliptic curve cryptography. It is not meant to be a complete and detailed guide to ECC (the web is full of information on the subject), but to provide a simple overview of what ECC is and why it is considered secure, without losing time on long mathematical proofs or boring implementation details.
  3. The Bitcoin Wiki is here >> https://en.bitcoin.it/wiki/Main_Page

Open all of these in their own tabs so you can have them open and refer to them from time to time. Bitcoin 0.1.5 only compiles on Windows. I have been able to compile it successfully, but it took hours to sort out the dependent libraries and the makefile. Compiling the project isn’t really needed as our goal is to understand the code, not to tweak or debug. So don’t spend time compiling it unless you are really inclined to.

Let’s begin:

Code

Prerequisites

  1. Download the code here >> https://github.com/bitcoin/bitcoin/tree/v0.1.5
  2. Use of github is beyond the scope of this document. Learn to use github here >> https://lab.github.com/
  3. Clone the above repository.
  4. If you want to follow along with what I’ve done you will need doxygen. Download Doxygen here >> http://www.doxygen.nl/
  5. Use of doxygen is beyond the scope of this document. Learn to use it here >> https://www.star.bnl.gov/public/comp/sofi/doxygen/starting.html

Overview

When a bitcoin application starts, it does the following:

  • Loads the data from database to memory.
  • Draws the main application window.
  • Starts a local bitcoin node that includes 3 long-running threads.
  • Starts the mining thread.

The initial data loading makes sure that bitcoin is restored to its last-known state before it starts to catch up with the missing data.

The main application window is what we call a bitcoin wallet today. It can be used to create bitcoin addresses and send bitcoins. Most people today do not use this wallet; they use third-party wallets such as blockchain.info, Jaxx or Mycelium.

The long running threads start a p2p communication network, where our local node exchanges messages with peers to make sure transactions go through and consensus is reached.
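To make the idea of long-running threads a little more concrete, here is a minimal sketch in modern C++. It is not how the 0.1.5 source launches its threads; I am using std::thread as a stand-in, and the class name NodeSketch and everything inside it are my own hypothetical names. Each worker just loops until it is told to shut down, which is roughly the shape any long-running thread has.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical stand-in for a local node. It is NOT the class used in the
// bitcoin source; it only shows the "loop until shutdown" shape of a
// long-running worker thread.
class NodeSketch {
public:
    ~NodeSketch() { Stop(); }

    // Launch workerCount threads, each running WorkerLoop().
    void Start(int workerCount) {
        for (int i = 0; i < workerCount; ++i)
            workers_.emplace_back([this] { WorkerLoop(); });
    }

    // Ask every worker to finish, then wait for all of them.
    void Stop() {
        shutdown_.store(true);
        for (std::thread& t : workers_)
            if (t.joinable()) t.join();
        workers_.clear();
    }

    // Number of workers that have left their loop.
    int Finished() const { return finished_.load(); }

private:
    void WorkerLoop() {
        while (!shutdown_.load()) {
            // Placeholder for real work (e.g. exchanging messages with peers).
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        finished_.fetch_add(1);
    }

    std::atomic<bool> shutdown_{false};
    std::atomic<int> finished_{0};
    std::vector<std::thread> workers_;
};
```

Calling Start(3) would stand in for the three node threads, and a fourth thread for the miner could be launched the same way. A shared shutdown flag is the simplest way I know to let every thread exit cleanly when the application closes.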

The mining thread starts the process of creating new blocks, proposing blocks to the network, and getting a reward for doing this. Not all machines serving as nodes are miners. Some machines are nodes and do no mining. They are even more important than the miners. I run a Raspberry Pi based node which does no mining. Here is a video showing the compilation process (sorry, but that process is beyond the scope of this document): https://youtu.be/U2ucVhc1v_w, and this one shows it running after compilation completes: https://youtu.be/cY9bqmIW_S0. Nodes like mine build consensus by verifying transactions. This video presentation by Andreas is a useful lesson on this >> https://youtu.be/fNk7nYxTOyQ. You can find my node among the many nodes worldwide here >> https://bitnodes.earn.com/.
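If you have never seen what ‘creating a new block’ boils down to, this toy sketch may help. A miner keeps changing a number (the nonce) until the hash of the block data plus that nonce falls below a target. Bitcoin hashes block headers with double SHA-256; to keep the example short I am swapping in FNV-1a, a simple non-cryptographic hash, so treat this as an illustration of the search loop only. The function names ToyHash and MineToyBlock are hypothetical and do not appear in the bitcoin source.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a, 64-bit: a simple NON-cryptographic hash used here only as a
// stand-in. Bitcoin itself uses double SHA-256 on the block header.
std::uint64_t ToyHash(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Try nonces 0, 1, 2, ... until the hash of (header + nonce) is below target.
// Returns true and stores the winning nonce in nonceOut, or returns false
// if maxTries nonces were tried without success.
bool MineToyBlock(const std::string& header, std::uint64_t target,
                  std::uint64_t maxTries, std::uint64_t& nonceOut) {
    for (std::uint64_t nonce = 0; nonce < maxTries; ++nonce) {
        if (ToyHash(header + std::to_string(nonce)) < target) {
            nonceOut = nonce;
            return true;
        }
    }
    return false;
}
```

A smaller target means fewer hashes qualify, so more nonces have to be tried on average. That is the knob that makes mining harder or easier. Checking a claimed nonce takes a single hash, which is why a non-mining node can verify a miner’s work cheaply.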

The Raspberry Pi is an amazing machine. I don’t have time to get into all that it can do but take it from me, I was able to start my computer career nay my computer life only because of small, cheap computers like the RasPi. At the time these were the Sinclair and Commodore machines and today the RasPi is so, so much more. Here … this book tells a whole story better than I ever can. And you can keep up with RasPi development through their website. I download and read the MagPi publication as often as it comes out … so much fun.

Chapter 3 of Andreas’s book covers setting up a bitcoin node, compiling the code and interrogating it to verify transactions as he talks about in the video linked to in the preceding paragraph. Here is a link to that chapter of his book >> https://github.com/bitcoinbook/bitcoinbook/blob/develop/ch03.asciidoc and to learn exhaustively about nodes look here >> https://en.bitcoin.it/wiki/Full_node.

The Code

The code comprises 12 files, 12,957 lines of code, 744 symbols and 865 relations. That is not too big, and with my explanations below and the references I’ve provided you’ll soon have a good grasp of what this code is doing.

The files are:

Let’s dive into the code that starts the ball rolling. That code is in file

// Define a new application
class CMyApp: public wxApp
{
  public:
    CMyApp(){};
    ~CMyApp(){};
    bool OnInit();
    bool OnInit2();
    int OnExit();

    // 2nd-level exception handling: we get all the exceptions occurring in any
    // event handler here
    virtual bool OnExceptionInMainLoop();

    // 3rd, and final, level exception handling: whenever an unhandled
    // exception is caught, this function is called
    virtual void OnUnhandledException();

    // and now for something different: this function is called in case of a
    // crash (e.g. dereferencing null pointer, division by 0, ...)
    virtual void OnFatalException();
};

At this juncture we pivot around the class CMyApp, which is shown in the image above. The image gives you an overview of the class, and the listing above gives the actual declaration in text form.
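The comments in the CMyApp declaration talk about levels of exception handling. Here is a small sketch of how such layers can fit together. It does not use wxWidgets at all: AppBaseSketch is a hypothetical stand-in base class that only borrows the hook names from the declaration, and the bodies are my assumptions, not Satoshi’s code. In particular, having OnInit wrap OnInit2 in a try/catch is my guess at why the two functions are separate.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a GUI framework's application base class.
// It is NOT wxApp; it only mimics the hook names seen in CMyApp.
class AppBaseSketch {
public:
    virtual ~AppBaseSketch() = default;
    virtual bool OnInit() = 0;
    // Return true to keep running after a handler threw, false to stop.
    virtual bool OnExceptionInMainLoop() { return false; }
    virtual void OnUnhandledException() {}

    // Minimal "main loop": run each event handler in turn and route any
    // exception through the hooks above.
    int Run(const std::vector<std::function<void()>>& handlers) {
        try {
            if (!OnInit()) return 1;
            for (const auto& handler : handlers) {
                try {
                    handler();
                } catch (...) {
                    if (!OnExceptionInMainLoop()) return 2;  // 2nd level
                }
            }
        } catch (...) {
            OnUnhandledException();  // last resort
            return 3;
        }
        return 0;
    }
};

class MyAppSketch : public AppBaseSketch {
public:
    std::vector<std::string> log;

    // Assumption: OnInit guards the real start-up work done in OnInit2,
    // so a start-up failure is reported instead of escaping.
    bool OnInit() override {
        try {
            return OnInit2();
        } catch (const std::exception& e) {
            log.push_back(std::string("OnInit: ") + e.what());
        }
        return false;
    }
    bool OnInit2() { log.push_back("init"); return true; }

    bool OnExceptionInMainLoop() override {
        log.push_back("handler threw");
        return true;  // keep the loop alive
    }
    void OnUnhandledException() override { log.push_back("unhandled"); }
};
```

With this arrangement a misbehaving event handler is logged and the loop carries on, while anything that escapes every other handler still ends up in one final place. OnFatalException is left out because crashes such as a null pointer dereference are not C++ exceptions and cannot be simulated portably in a few lines.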

Congratulations on following along this far with me. Now you’ve seen what I am doing, which is to follow the call stack of a function, getting a basic feel for what is being done, following the execution path and slowly winding my way deeper and deeper down the rabbit hole. Of course there will be cross-linking, and you will not always read in a linear order. The stack may be very deep, but do try not to get lost. If you do get lost, don’t fret. Put this work away for a while, go for a walk or take a short nap. Come back later, start somewhere in the document where you were not lost, and wind your way down the rabbit hole once again. You should find that you manage to get just a little bit deeper on this new trip before you get lost in detail once more. Rinse and repeat and keep going like this. You’ll soon have traversed the entire code base quite a few times and will be getting a real feel for what is going on, and the magic of what Satoshi has done will hit you hard!

Onward …

Bitcoin addresses and fundamentals

Now let’s look into how a bitcoin address is created. More importantly, we will use the bitcoin address as an example to introduce 3 building blocks of bitcoin:

  • Serialization
  • Database
  • Cryptography

To get a solid grounding in what’s going to be discussed here read Andreas’s Chapter 4. You might have to read that chapter through several times and come back here when you’re done. But do make sure you read it and digest it well or else you won’t be able to follow what the code in this section is doing.
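Of the three building blocks listed above, serialization is the easiest to show in a few lines: before anything can be written to a database or sent to a peer it has to be turned into a flat run of bytes, and later turned back. The sketch below is a toy of my own. ByteStreamSketch and its one-byte string length are assumptions for illustration, not the stream class or the exact wire format used in the bitcoin source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy byte stream; not the stream class from the bitcoin source.
class ByteStreamSketch {
public:
    // Append a 32-bit integer, least significant byte first (little-endian).
    void WriteU32(std::uint32_t v) {
        for (int i = 0; i < 4; ++i)
            bytes_.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
    }

    // Append a short string as a one-byte length followed by its characters.
    void WriteString(const std::string& s) {
        if (s.size() > 255)
            throw std::length_error("string too long for this toy format");
        bytes_.push_back(static_cast<std::uint8_t>(s.size()));
        bytes_.insert(bytes_.end(), s.begin(), s.end());
    }

    // Read back a 32-bit little-endian integer from the current position.
    std::uint32_t ReadU32() {
        Require(4);
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(bytes_[pos_++]) << (8 * i);
        return v;
    }

    // Read back a length-prefixed string from the current position.
    std::string ReadString() {
        Require(1);
        std::size_t n = bytes_[pos_++];
        Require(n);
        std::string s(bytes_.begin() + pos_, bytes_.begin() + pos_ + n);
        pos_ += n;
        return s;
    }

    const std::vector<std::uint8_t>& Bytes() const { return bytes_; }

private:
    // Throw if fewer than n unread bytes remain.
    void Require(std::size_t n) const {
        if (bytes_.size() - pos_ < n) throw std::out_of_range("read past end");
    }

    std::vector<std::uint8_t> bytes_;
    std::size_t pos_ = 0;
};
```

Writing 0x01020304 and then the string "key" produces eight bytes, starting with 04 03 02 01, and reading in the same order gives the original values back. The point to take away is that writer and reader must agree on order, width and byte order. Keep that in mind when we get to the real serialization code.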