thinkbase

Problem:

I read a lot of books but often forget what I read, even when I understood it. What I’d love to have is a perfect memory, so that whenever a topic comes up in conversation, say ‘tradeoffs between different fault tolerance abstractions in database design’ (Kleppmann - Designing Data-Intensive Applications) or ‘the different ways narrow AI could develop into superintelligence’ (Bostrom - Superintelligence: Paths, Dangers, Strategies), I’d have all the arguments, summarised and with supporting examples, ready to discuss. Crucially, though, when I reference those books in a conversation (assuming I can remember them), I do not just regurgitate the whole book verbatim; instead I extract the specific concepts that are relevant to the conversation.

Best-guess solution:

Data input

Notes are added in an unstructured text format, each summarising one ‘atomic’ concept or idea in a couple of paragraphs, with a source reference to a page or web link. Ideally, it should be possible to scrape books for data input, but rather than simply inputting the text, we need to input the concepts into the database.
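
A minimal sketch of what one such note could look like as a record. The class and field names (Source, ConceptNote, page, url) are assumptions made for illustration, and the sample book, author and page number are placeholders rather than a real citation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Source:
    """Where a note came from (all field names are hypothetical)."""
    title: str
    author: str
    page: Optional[int] = None  # page reference for a book
    url: Optional[str] = None   # web link for an online source


@dataclass
class ConceptNote:
    """One 'atomic' concept: a title plus a couple of free-text paragraphs."""
    title: str
    paragraphs: List[str]
    source: Source

    def text(self) -> str:
        """Return the note body as unstructured text."""
        return "\n\n".join(self.paragraphs)


# Placeholder content: the book, author and page number are made up.
note = ConceptNote(
    title="Example concept",
    paragraphs=[
        "First paragraph summarising the idea.",
        "Second paragraph with a supporting example.",
    ],
    source=Source(title="Example Book", author="A. Author", page=42),
)
print(note.title, "-", note.source.title, "p.", note.source.page)
```

Keeping the body as plain paragraphs, with the source held separately, matches the aim of unstructured input while still letting every note be traced back to a page or link.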

Data selection

Rather than being tagged, content is ‘searchable’, because I do not want to constrain selection to pre-planned input formats. However, searching should work not at the level of words but at the level of ideas.

Data aggregation

Furthermore, it shouldn’t be limited to searching for individual entries, but should also allow ‘concept aggregation’, such as creating lists or comparing and contrasting different sources.
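
A rough sketch of the simplest form of aggregation: collect every note that matches a query and stack the results into one list, grouped by source. Plain case-insensitive substring matching is used here only as a stand-in, since idea-level search is the unsolved part. The names Note, aggregate_by_source and as_list are hypothetical, and the sample notes and sources are placeholders.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Note(NamedTuple):
    text: str
    source: str


def aggregate_by_source(notes: List[Note], query: str) -> Dict[str, List[str]]:
    """Collect every note whose text contains `query`, grouped by source."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    needle = query.lower()
    for note in notes:
        if needle in note.text.lower():
            grouped[note.source].append(note.text)
    return dict(grouped)


def as_list(grouped: Dict[str, List[str]]) -> str:
    """Stack the matching notes into a single list, citing each source."""
    lines = []
    for source, texts in grouped.items():
        for text in texts:
            lines.append(f"- {text} ({source})")
    return "\n".join(lines)


# Placeholder notes; the texts and source names are made up.
notes = [
    Note("Placeholder cause 1 of the first world war", "Source A"),
    Note("Placeholder cause 2 of the first world war", "Source B"),
    Note("Unrelated note about databases", "Source A"),
]
grouped = aggregate_by_source(notes, "first world war")
print(as_list(grouped))
```

Grouping by source keeps the compare-and-contrast case open: the same dictionary can be printed as one merged list or side by side per source.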

Problems with other solutions:

Google

  1. Searches text rather than concepts. Ultimately this is the wrong level of abstraction. There are three levels: 1. data/text/facts 2. concepts/ideas 3. arguments/essays. We want to search at the second level.

  2. We also want to be able to aggregate to level 3 abstractions e.g. you might google causes of the first world war and get 10 links, but you don’t get an aggregated list of causes drawn from multiple sources.

Wikipedia

  1. Wikipedia stores information including level 2 and 3 abstractions in a standardised, high quality format. However, level 2 abstractions tend to be buried in long pages of text, and level 3 abstractions are rarely the ones you specifically want, although lists are pretty good.

  2. Also, Wikipedia lacks information from books (perhaps for IP reasons).

Notetakers

  1. Notes tend to be long and difficult to search. They are usually kept at a much higher level of aggregation, such as the length of a long article.

  2. Require lots of manual input and upkeep.

Databases

  1. Store information at too low a level of abstraction, i.e. data/facts rather than concepts or arguments.

Graphs

Books

  1. The problem with books is that they are too long, and it is not easy to move down the conceptual hierarchy from long arguments to component concepts.

  2. Also, they cannot be dynamically searched and mixed and matched.

Technical hurdles:

  • How to search without a pre-defined graph/relational structure and without relying on literal text search?

  • How to extract concepts from a book? Would this need unsupervised machine learning guided by feature/structural hints? And if two independent unsupervised learning algorithms are run on two different books, how do you compare the resulting feature sets?

  • How to create abstraction aggregations? Use deep learning to find them, or manually add, say, ‘create a list’ or ‘create a set of advantages and disadvantages’ etc.?
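
The manual option in the last hurdle could be sketched roughly as follows, assuming each note already carries a hand-assigned structural kind. That tagging step is itself an assumption, as are the names TaggedNote and apply_template; the note texts are placeholders.

```python
from typing import Dict, List, NamedTuple, Sequence


class TaggedNote(NamedTuple):
    concept: str
    kind: str  # hand-assigned, e.g. "definition", "advantage", "disadvantage"
    text: str


def apply_template(
    notes: Sequence[TaggedNote], concept: str, kinds: Sequence[str]
) -> Dict[str, List[str]]:
    """Build an aggregation for one concept: one list of texts per requested kind."""
    view: Dict[str, List[str]] = {kind: [] for kind in kinds}
    for note in notes:
        if note.concept == concept and note.kind in view:
            view[note.kind].append(note.text)
    return view


# Placeholder notes; the texts are made up.
notes = [
    TaggedNote("HDFS", "definition", "placeholder definition"),
    TaggedNote("HDFS", "advantage", "placeholder advantage"),
    TaggedNote("HDFS", "disadvantage", "placeholder disadvantage"),
    TaggedNote("YARN", "advantage", "placeholder advantage for another concept"),
]

# 'Create a set of advantages and disadvantages' expressed as a template.
pros_and_cons = apply_template(notes, "HDFS", ["advantage", "disadvantage"])
print(pros_and_cons)
```

A template here is just a list of kinds to pull out, so ‘create a list’ and ‘advantages and disadvantages’ become different arguments to the same function rather than separate features.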

MVP:

  • I’ve just started using Simplenote and I’m writing little concept-level notes for myself. I’m not using any tagging but am going to rely solely on the search function. The goal is to see what it is like, how well it works and where it could be improved.

Screenshot from 2018-12-31 06-03-47.png

MVP Findings:

  • 03/01/2019 - Reading Ray Dalio’s Principles, I want not only to store the concept but also to apply it. How can the database be enhanced so that I can write notes on and apply ideas (perhaps from multiple sources) and then store and re-access those updates later?

  • 03/01/2019 - Difficult questions about what abstractions/notes to store. My intuition is to think about what is actually useful, but more precise criteria are not clear. E.g. Dalio says one should systematize knowledge - is that a concept? Is that worth including? I find that lists or more extended recipes are more useful, which is interesting because that itself suggests an initial level of aggregation. Is it possible to generalise the types of concept-level abstraction (lists, theories, etc.)?

  • 04/01/2019 - When creating a comparison between derived data systems and distributed transactions with atomic commit from Kleppmann, I find there are a lot of definitions I’d like to reference. The current solution is to include the definitions in a list at the bottom of the page, but clearly a better structure is for them to be separate concepts/definitions which can then be referenced. This is essentially a hyperlink, but perhaps an implicit one is better than a manually hard-wired explicit one (i.e. if one concept includes the terms ‘definition’ and ‘total order broadcast’ and another concept uses the term ‘total order broadcast’, perhaps these could be matched). Clearly I cannot just keep switching back and forth between different notes, so it would be useful to be able to open two windows, one for the concept and a second for definitions, and to have a search for a concept or set of concepts also bring up a list of related definitions.

  • 04/01/2019 - The question is how disaggregated a concept should be. Ideally you want to be able to aggregate everything up from concepts, so let us assume that for any concept in the database you never want to change or take a subset of its text. There then needs to be a separate concept text for the implementation, advantages, disadvantages, definition, comparison etc. of every concept. This quickly becomes unwieldy from a data input standpoint. It is also harder to output, unless there are good aggregation functions.

  • 05/01/2019 - If you could reliably classify text by type of structure, e.g. definition, theory, example, advantages, disadvantages, comparison etc., and then build a graph of how those pieces of text relate to each other, then I think it would be relatively easy to search/explore that space. Maybe a good starting point would be to have users ‘save’ text from electronic books, with each save requiring a structural tag, which would give something to train on so that in the long run those concepts could be extracted automatically. Or perhaps even average over lots of people’s hand-written notes etc.

  • 11/01/2019 - How do you build a tool that helps an individual work toward the truth? How do you build an individual tool for structured, systematic decision making? Maybe some concepts in think|base can act like templates, which might be a series of list queries but structured by questions, e.g. ‘what would convince me that I’m wrong?’, exploring the unknown/unclear/not-definitely-true decision space systematically. Could this be scaled to company-wide decision making?

  • 11/01/2019 - Provocative question: what would the tool look like if you knew that you had no memory of the last 24 hours? I.e. you can think long and hard about problems each day but cannot, without prompts, remember anything you read etc.

  • 13/01/2019 - I want to write notes, e.g. ask myself Dalio’s question of what my principles are, but how should I store the notes? If I store them as concepts, then what about privacy and separation from text notes? I would also like to be able to track them over time, e.g. principles from 2019, principles from 2020 etc., so there needs to be an aggregate function associated with a timeline. It would also be nice to have condensed summary notes for personal musings as well as free-ranging brain-dump notes.

  • 13/01/2019 - Going through Dalio’s simple framework of 1. what do you want 2. what is true 3. what are you going to do about it, it immediately becomes obvious when I’m writing my personal reflections into the database that I want to a) have a dual screen so I can see a concept and write my reflections on it, and b) be able to link my notes to Dalio’s principles, so that later I can perhaps bring up the dual screen again. Currently I am doing this manually with a reference to Dalio’s framework. How would I write notes on multiple frameworks? Probably as separate things, but it might be nice to integrate the notes so as to compare across frameworks.

  • 17/01/2019 - If notes are added at a concept level, how do I pre-bake in aggregations, e.g. Dalio’s 5-step process to get what you want out of life? Some aggregations will be impossible for the computer to work out itself, so I need to be able to add them.

  • 22/01/2019 - I want to be able to output a list of all the different snippets of information on HDFS, HBase, YARN, Impala, Hive etc. that I have read in various books. Unfortunately, it is unclear how that would work. For example, if one snippet talks about HDFS and compares it in passing to RAID, do I want a list of RAID-related notes to include it as well as the HDFS-related list? Basically, I need a way to tier associations.

  • 22/01/2019 - As an MVP I think the key feature is being able to search text, and stack notes on top of each other to make a list, which can be manually edited etc. to refine.

  • 30/01/2019 - It is boring and slow to copy out metadata (think source and summary) and write lots of small blocks of insight, e.g. rather than 4 separate ‘advantage of HDFS blocks’ notes, it is much easier to write all the HDFS blocks material at once. However, perhaps it could be easy to tag metadata/source paragraphs and, when aggregating, return not the whole note but paragraphs within the note. At the very least, each paragraph could automatically be stored as a separate note to be searched and aggregated.

  • 01/04/2019 - Writing literature review notes in Overleaf, but there are a number of problems: 1. It is slow 2. I cannot search text within a document or across documents 3. It is annoying to upload images 4. Notes inevitably have less detail than I’d like. The ideal solution would be a shift-print type screenshotter which automatically converts screenshotted text (say a paragraph from an article) and puts it in a database as searchable text, with a given source, author, title, paper name etc., so that I can search all possible references to, say, non-maximum suppression.

  • 22/05/2019 - I want to prepare competencies, and because I want them all together, writing in LaTeX feels easier. Not being able to see them all together is annoying, as is printing them out in a nice format.
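
Two of the findings above (22/01/2019 on tiering associations, 30/01/2019 on storing each paragraph as its own note) can be sketched together. This is only one possible heuristic, assumed for illustration: a paragraph that mentions a term at least twice counts as a primary association, and a single mention counts as a passing one. The names split_note, count_mentions and tiered, the threshold and the sample text are all hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class Paragraph(NamedTuple):
    text: str
    source: str


def split_note(note_text: str, source: str) -> List[Paragraph]:
    """Split a long note on blank lines; each paragraph inherits the note's source."""
    parts = re.split(r"\n\s*\n", note_text.strip())
    return [Paragraph(part.strip(), source) for part in parts if part.strip()]


def count_mentions(text: str, term: str) -> int:
    """Count whole-word, case-insensitive mentions of `term` in `text`."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def tiered(
    paragraphs: List[Paragraph], term: str, primary_threshold: int = 2
) -> Tuple[List[Paragraph], List[Paragraph]]:
    """Return (primary, passing) paragraphs for `term`, based on mention counts."""
    primary, passing = [], []
    for paragraph in paragraphs:
        mentions = count_mentions(paragraph.text, term)
        if mentions >= primary_threshold:
            primary.append(paragraph)
        elif mentions > 0:
            passing.append(paragraph)
    return primary, passing


# Placeholder note written 'all at once'; the source string is made up.
raw = (
    "HDFS splits files into blocks. HDFS replicates each block.\n"
    "\n"
    "HDFS replication is compared in passing to RAID here.\n"
    "\n"
    "This paragraph is about YARN only."
)
paragraphs = split_note(raw, source="Example Book, p. 1")
hdfs_primary, hdfs_passing = tiered(paragraphs, "HDFS")
raid_primary, raid_passing = tiered(paragraphs, "RAID")
print(len(hdfs_primary), len(hdfs_passing), len(raid_primary), len(raid_passing))
```

With this split, the paragraph that only mentions RAID in passing still shows up in a RAID list, but in a lower tier that can be hidden or shown on demand. Mention counting is crude, and any real tiering would need a better signal of what a paragraph is mainly about.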