Hi there Thomas, just a quick question concerning the status of this project? Is it done and how did it turn out? I feel this is actually a very important project for the future of Monero and I was just curious on the status. Also, is it possible to post an example of what you are doing, so interested people can get an idea of what is taking place?
Documentation and cleanup of source code
What
Clean up and properly document (doxygen) the monero source code, and handle any merges necessary to do so seamlessly with other concurrent development efforts.
Who
I'm Thomas Winget, and I've been contributing to the Monero codebase since around June 2014. My largest contribution is the migration from storing the blockchain in a static, binary format to storing the blockchain using a database API, as well as two BlockchainDB implementations (LMDB and BerkeleyDB). Other contributions of mine can be found via the network graph on github or simply looking at my fork.
As a side effect of the work I've done already on Monero, I'm rather familiar with most of the codebase. This puts me in a good position to write down how everything works and fits together, as I've had to sort that out already for many parts.
As a side effect of this effort, I will further cement the knowledge I have of the codebase and gain knowledge in areas I'm less familiar with, enabling me to efficiently implement new features in the future and guide others who may have questions regarding a specific module. In addition, documenting the code will function as a sort of code review, allowing the possibility of spotting any bugs (minor or otherwise).
Why
In my time developing on Monero, there have been countless instances where I need to reference what another piece of code does in order to work with it, and having all the code documented properly in an easily accessible manner (doxygen) would be helpful for that for anyone working in Monero's codebase. I have also come across areas where the code could be made more efficient or clear without changing any functionality, as well as many typographical errors (though these are minor, I think we can all agree they're annoying if nothing else).
Documentation is by far not a glamorous task (probably why it's a bit sparse in Monero), but every development effort benefits immensely from properly documented code.
Proposal and Milestones
I would like to spend about 20 hours per week working on the above, at a rate of 45 XMR per hour (just over $20/hr USD at the time of writing this), paid out every week. I am unsure whether or not 80 hours of work will be enough to cover the entire codebase. There are over 70,000 lines of source in the src/ folder alone, and there is code in other folders that brings the total to over 100k, so this is no small task!
Milestones would be as follows:
- First 20 hours (900/3600 XMR)
- Second 20 hours (1800/3600 XMR)
- Third 20 hours (2700/3600 XMR)
- Fourth 20 hours (3600/3600 XMR)
After each milestone (and during), I will be available for comment, so anything that seems unclear in my documentation can be addressed and dealt with accordingly.
Ok, final 20 hours complete at long last. Really a bit longer, but I'm still not anywhere near "done". I just likely won't attempt to do so much at a time again... >_>
Much of this time was spent rebasing/cherry-picking the previous milestones' commits, which is why above I said "really a bit longer", as that didn't quite seem fair. I'll be doing more yet, but as I'm over 20 hours by a bit and just wrapped up a section, I thought it's time for an update. New developments follow below.
Transaction pool documentation:
commit c6bb201a07d84e195be901acda2fcd0b886bf66f
Author: Thomas Winget
Date: Thu Mar 24 20:03:02 2016 -0400
Transaction pool documentation (and some cleanup)
tx_pool.h doxygen documentation completed.
Many notes made on areas for improvement, be that functionality or
code clarity.
Commented code and unused code removed.
As the pull request for the previous milestones is still pending, I have not yet submitted this as a pull request. The changes can be found here on my github.
Testnet documentation:
This documentation isn't something that will go in the source code directly, and thus not on the github. I went through every instance of "testnet" in the source to evaluate what "testnet or not" means. I could probably make the resulting document look nicer, but it should be clear enough. While this is not directly documenting the source code, it will lead to better and ideally self-documenting code. For now, I'll simply paste the document below, but if anyone has any questions I'll be glad to answer them. I've also uploaded it to fpaste because the preview shows that it's ugly on here.
For reading it:
The organization is source lines followed by their purpose, followed by a line of hyphens, repeat for all purposes of the notion of "testnet". Source lines which are grouped together without a line between are directly related to one another. The exception to this is the "pass the testnet flag around" section, as it didn't seem useful to group them in this manner.
EDIT: The formatting the preview showed broke into fragile pieces, apparently, so just look at it on fpaste.
Okay, long delay since the last milestone update, but here's #3:
BlockchainDB's documentation is now Doxygen-compliant and up-to-date with recent changes. A couple of things were renamed, as their purpose had changed but their names implied the old purpose.
It's amazing how much things change in under a year. BlockchainDB has undergone many changes, so the new documentation is not guaranteed to be 100% correct. I'll be attaching my notes to the end of this post, some of them refer to bits I still need to ask about (or spend time looking into). I'll also paste the commit logs at the end, because why not.
If anyone has thoughts on my notes below, please don't hesitate, as many of them I'm not 100% sure what the best course of action is at this time.
As always, you can follow updates on my github repo.
Current TODO list
General:
Discuss updating function names to better reflect their purpose.
Fix inconsistent indentation.
evaluate return values of all functions (lots of bool return values not being used, etc.) related to this, consider throwing rather than returning empty containers where a container is the return type
src/
cryptonote_core/
cryptonote_core.*/
move BlockchainDB initialization and DB-specific code from cryptonote_core.* to better places
blockchain.*/
evaluate locking mechanism and possibility of separate reader/writer locks
blockchain_db/
blockchain_db.*/
remove_transaction_data() should not remove outputs as well, but currently
subclasses do so. This should be reorganized. The current subclass function
to do so is remove_tx_outputs()
global output index counting should maybe be handled in the parent class,
up for debate.
Need to make sure that aborting a partial add_block after an exception
will not error out part way through due to another exception. This could
be tricky, as "remove_*" might fail because it hadn't been added yet, or
it might fail because of a database issue. This will likely require more
fine-grained exceptions. For now, with the exceptions that exist, I think
that a generic DB_ERROR is suitable for, well, a db error, and the DNE
exceptions should be used as such. This will likely be sufficient to
differentiate.
add_transaction (concrete in parent) takes a pointer to a tx hash as a
parameter. This should be a reference, but perhaps there's a reason it
is not. Needs looking into.
Destructor: should it call sync() and close() if open?
open: db_flags should either be handled in subclass implementations
or standardized among all blockchaindbs.
batch transactions: batch_stop and batch_commit both have commit code,
batch_stop should call to batch_commit instead.
IMPORTANT: make sure BlockchainDB (and/or subclasses) handles correctly
removing partial block data in the event of a failure during a block add.
get_blocks_range: should this throw if the request is out of range, or
should it let get_block_from_height throw and just ignore it as it does
now?
get_tx_list: much the same as above with get_blocks_range.
*important* hardfork functions: verify that documented functionality is true/correct
** Git log **
commit cc3394b38d87785aafe850e58dc2d79cd65d47de
Author: Thomas Winget
Date: Tue Dec 1 17:38:22 2015 -0500
Update BlockchainDB documentation
BlockchainDB is now Doxygen-compliant and its documentation is
up-to-date with recent changes.
src/blockchain_db/blockchain_db.h | 1063 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------
1 file changed, 911 insertions(+), 152 deletions(-)
commit 204ed981b0e114dc122de16d8642ec44745074f1
Author: Thomas Winget
Date: Tue Dec 1 15:03:46 2015 -0500
Rename function for clarity
As BlockchainDB::get_output_key was changed to return a struct containing
the output key along with two other pieces of metadata for the output, the
function name is now changed to be get_output_data.
In addition, the database handles in db_bdb and db_lmdb were renamed
from m_output_keys to m_output_data to mirror this.
src/blockchain_db/berkeleydb/db_bdb.cpp | 28 ++++++++++++++--------------
src/blockchain_db/berkeleydb/db_bdb.h | 8 ++++----
src/blockchain_db/blockchain_db.h | 8 ++++----
src/blockchain_db/lmdb/db_lmdb.cpp | 22 +++++++++++-----------
src/blockchain_db/lmdb/db_lmdb.h | 8 ++++----
src/blockchain_utilities/blockchain_dump.cpp | 2 +-
src/cryptonote_core/blockchain.cpp | 10 +++++-----
7 files changed, 43 insertions(+), 43 deletions(-)
I'd like to thank everyone for their patience with me, as I've been very slow to complete this task. I will admit that a large part of that is motivation, as it is a rather boring task. That said, another part of it is frustration. There have been a lot of great changes in the source over the past year, and I have not kept up with even half of them (other than reading commit logs). As I have worked to document some of these changes, I have seen changes which, while I agree with what they do, I disagree with how they've been done (from an organization/maintainability standpoint). I admit 100% that this is an issue of pride, and I will work past it.
As I work to complete this contract, I am taking notes of the things I disagree with. Obviously mine is but one opinion, so I will not change any of these things for this contract (except obvious things like replacing BOOST_FOREACH with C++11 for, etc.). After I've completed this contract I will review my notes and see if I can come up with solutions, then submit them for review.
Again, thank you all (especially those who donated to this effort, but also to the community at-large) for your patience. I intend to spend the whole day working on more documentation, so look for an update at or before 00:00 UTC Wednesday morning.
tl;dr - look for next milestone update at or before the coming midnight UTC.
thanks for sticking with it man!
How is it going? Also, was the 4x20 hours meant as consecutive? Or just arbitrary. I am looking forward to hearing some new ideas as well btw :)
Update #2:
Second 20 hours completed. Progress can be seen on the branch on my fork.
Rather than retype all of the commit messages here, you'll find a paste of "git log --stat" below, or at least hopefully you will, this preview thing hates me.
Please feel free to have a look through the generated Doxygen and let me know if anything seems unclear or incorrect (including Doxygen I haven't touched yet, it all needs to be correct eventually, and if you spot something I may as well know about it!)
commit 4adba9bf3b3de73c31f2c34eb6afa284935373d5 Author: Thomas WingetDate: Wed Oct 7 22:28:59 2015 -0400 Change Doxyfile, Blockchain not blockchain_storage Changes the Doxyfile to expand preprocessor macros, but only the ones defined in the Doxyfile. This way we can specify that BLOCKCHAIN_DB == DB_LMDB for the sake of documentation. Doxyfile | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) commit 31896712c05a5bca63128a2a3061ba07935b913f Author: Thomas Winget Date: Wed Oct 7 22:28:54 2015 -0400 remove defunct code from cryptonote::core src/cryptonote_core/cryptonote_core.cpp | 13 ------------- src/cryptonote_core/cryptonote_core.h | 10 ---------- 2 files changed, 23 deletions(-) commit 9366319367e86f5802de30607f1f22772e171bda Author: Thomas Winget Date: Wed Oct 7 22:28:51 2015 -0400 cryptonote::core doxygen documentation src/cryptonote_core/cryptonote_core.cpp | 2 +- src/cryptonote_core/cryptonote_core.h | 612 ++++++++++++++++++++++++++++++-- 2 files changed, 588 insertions(+), 26 deletions(-) commit f2cc3a9eaf51bb91c3fa8aff2dc8c796872cfe42 Author: Thomas Winget Date: Wed Oct 7 22:28:43 2015 -0400 doxygen documentation for difficulty functions src/cryptonote_core/difficulty.cpp | 8 +++++--- src/cryptonote_core/difficulty.h | 37 +++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+), 3 deletions(-) commit 37fa7b44d69225dfa855cb0e7fe10d53d515e402 Author: Thomas Winget Date: Wed Oct 7 22:28:37 2015 -0400 Move checkpoint functions into checkpoints class The functions in src/cryptonote_core/checkpoints_create.{h,cpp} should be member functions of the checkpoints class, if nothing else for the sake of keeping their documentation together. This commit covers moving those functions to be member functions of the checkpoints class as well as documenting those functions. src/cryptonote_core/CMakeLists.txt | 2 - src/cryptonote_core/blockchain.cpp | 8 +- src/cryptonote_core/blockchain_storage.cpp | 8 +- src/cryptonote_core/checkpoints.cpp | 226 ++++++++++++++++++++++++ src/cryptonote_core/checkpoints.h | 67 +++++++ src/cryptonote_core/checkpoints_create.cpp | 271 ----------------------------- src/cryptonote_core/checkpoints_create.h | 48 ----- src/cryptonote_core/cryptonote_core.cpp | 4 +- src/daemon/core.h | 1 - 9 files changed, 303 insertions(+), 332 deletions(-) commit 04055a0634c1794f1dd8c9e3d0728e3681253646 Author: Thomas Winget Date: Wed Oct 7 22:28:31 2015 -0400 doxygen documentation for checkpoints.{h,cpp} All functions in src/cryptonote_core/checkpoints.h are now documented in doxygen style. checkpoints.cpp has been reviewed, one function has been marked for discussion on correctness. src/cryptonote_core/checkpoints.cpp | 5 +- src/cryptonote_core/checkpoints.h | 100 +++++++++++++++++++++++++++++++++++- 2 files changed, 99 insertions(+), 6 deletions(-) commit 910bcc077d0eabe23af48fbed70024fd3783f2df Author: Thomas Winget Date: Wed Oct 7 22:28:26 2015 -0400 documentation updates to Blockchain Reviewed and updated or removed FIXME and TODO comments src/cryptonote_core/blockchain.cpp | 40 ++++++++++++++------------------------ 1 file changed, 15 insertions(+), 25 deletions(-) commit 6fe7aedae9e52cb98187a060d0633871468c47d9 Author: Thomas Winget Date: Wed Oct 7 22:28:11 2015 -0400 minor bugfixes and refactoring - Blockchain should store if it's running on testnet or not - moved loading compiled-in block hashes to its own function for clarity - on handle_get_objects, should now correctly return false if a block's transactions are missing - replace instances of BOOST_FOREACH with C++11 for loops in Blockchain. src/cryptonote_core/blockchain.cpp | 138 +++++++++++++++++++++---------------- src/cryptonote_core/blockchain.h | 12 +++- 2 files changed, 91 insertions(+), 59 deletions(-) commit 94f7615c573fbc4b9d476ba994cf290f90dc8bf4 Author: Thomas Winget Date: Wed Oct 7 22:25:06 2015 -0400 Remove unnecessary or defunct code src/cryptonote_core/blockchain.cpp | 49 -------------------------------------- src/cryptonote_core/blockchain.h | 21 ---------------- 2 files changed, 70 deletions(-) commit 3e3408eb455f168eb88bfc01c197186c2cb12b60 Merge: 3ba43b3 82d7e79 Author: Thomas Winget Date: Wed Oct 7 22:21:43 2015 -0400 Merge upstream changes into documentation branch
It's going, but tedious. I'll post a proper update tomorrow, probably evening EST.
How could this possibly be tedious?
Your fighting the good fight sir. Thank you thank you.