09 Dec 2016

Semantic Mediawiki for personal knowledge management, using templates and a custom userscript

Here I’ll try to document my current setup for link management, which is slowly starting to take shape.

How we got here

Since the social bookmarking site Delicious (old links page) is seriously falling apart (which is very sad; I liked it almost as much as I liked Google Reader), I started looking for alternatives. For some time I used the WordPress LinkLibrary plugin, until I felt its rigid category system lacked flexibility (you can see on the “Links” page of this blog how cluttered and repetitive it is). I needed _tags_ and more ways to organize the links and, possibly, the relationships between them.

Then for a very short time I set up a WordPress installation specifically for links. I was not the first one who attempted this (https://sebastiangreger.net/2014/01/own-your-data-part-1-bookmarks/ as an example), but it did not work out well for me.

As for the existing social bookmarking services, for example http://pinboard.in or http://historio.us/, I did not want to pay and wanted control of my data (thank God the export feature in Delicious worked more often than not, but I don’t want to risk it anymore).

As for the need to “share” it, I want to have access to it from various places and, since there’s nothing private, putting it in the cloud and putting a password on it sounds like an unneeded layer of complication. Lastly — who knows — maybe someone will actually get some use out of it.

Semantic Mediawiki

Mediawiki is the software Wikipedia runs on. Semantic Mediawiki is an open-source extension for it that adds the ability to store and query data on a whole other level.

Semantics is, basically, meaning: the difference between “60”, “60kg”, and “My weight is 60kg.”

Traditionally, Mediawiki lets pages link to each other, but the exact nature of the connection is not clear, and you can’t do much with the connections. Semantic Mediawiki lets you define additional data for every page and typed relationships between pages. The data “Benjamin Franklin was born in America in 1706” suddenly becomes searchable, for example as “Give me the people born in America before 1800” or “Give me the list of countries where people named Ben were born”. A link “Benjamin Franklin -> Boston” becomes “Benjamin Franklin was (BORN IN) Boston”.
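To make the idea a bit more concrete, here is a minimal sketch in JavaScript. It is only an illustration and not how Semantic Mediawiki stores anything internally; the `facts` array and the `pagesWhere` helper are names I made up for this example.

```javascript
// Hypothetical in-memory model of semantic annotations (page, property, value).
// This is an illustration only, not Semantic Mediawiki's actual storage.
const facts = [
  { page: 'Benjamin Franklin', property: 'born in', value: 'Boston' },
  { page: 'Benjamin Franklin', property: 'birth year', value: 1706 },
  { page: 'Ada Lovelace', property: 'born in', value: 'London' },
  { page: 'Ada Lovelace', property: 'birth year', value: 1815 },
];

// Return the pages that have at least one fact for `property`
// whose value passes the `test` function.
function pagesWhere(property, test) {
  const pages = facts
    .filter((f) => f.property === property && test(f.value))
    .map((f) => f.page);
  return [...new Set(pages)]; // drop duplicates
}

// "Give me the people born before 1800"
const bornBefore1800 = pagesWhere('birth year', (year) => year < 1800);
console.log(bornBefore1800);
```

Once every link carries a named property, a question like “who was born before 1800” turns into a simple filter over those properties.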

This is awesome.

After looking at it, I understood that I had immense power in my hands and no idea how to use it. As in, how to create an architecture that is both meaningful and easy to adhere to.

Seeing all this, I thought it would make sense to upgrade my “Link database” to something much more interconnected and useful, a personal knowledge management system.

And here it is.

The system

Take this page.

Every page has 5 values:

l: the actual URI
t: the title
c: the complexity (how easy/hard it is to read; sometimes I just don’t want to think too much), 1 to 10
r: the rating, also 1 to 10
o: whether it’s a page with only one link, around which the content of the page has been built (as opposed to “Here are 5 links about X”)

Plus, of course, any additional text.

Properties can be set:

1) In the text itself, for example like this:

    [[l::https://plus.maths.org/content/os/issue53/features/hallucinations/index]]
    - [[t::Uncoiling the spiral: Maths and hallucinations.]]

2) Invisibly:

{{#set:
 o=true
 |c=8
}}

3) Using the following nice template I’ve written:

http://www.pchr8.net/wiki/index.php?title=Template:B

    [[l::{{{1}}}]] - [[t::{{{2}}}]]. 
    ----
    Complexity: [[c::{{{3|5}}}]]; Rating: [[r::{{{4|5}}}]]; Is only link: [[o::{{{5|true}}}]]
    
    
    {{#set:
     l={{{1}}}
     |t={{{2|{{{1}}}}}} <!-- If no title given, use URI as name -->
     |c={{{3|5}}} <!-- 5 as default value -->
     |r={{{4|5}}} <!-- 5 as default rating unless something else given -->
     |o={{{5|true}}} <!-- only link by default -->
    }}

which can be used like this:

{{B|
https://www.fastcodesign.com/3043041/evidence/why-our-brains-love-high-ceilings
|Why our brains love high ceilings
|5
|7
}}

My main goal for this was that it should be fast, and fast for me. I can type the above much faster than I can fill in multiple input boxes in a hypothetical GUI.

Then I decided to write some bad javascript to simplify it even more.

The bookmarklet/userscript

An actual bookmarklet is definitely the next thing I'll do; until then I'll be adding the pages manually.

But I wrote a small script (two years since I've used any Javascript, haha), to minimize the text above to just this:

https://www.fastcodesign.com/3043041/evidence/why-our-brains-love-high-ceilings
Why our brains love high ceilings
5
7

The (badbadbad) Javascript code is the following:

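// Read the edit box: URI, title, complexity, rating and (optionally) the "only link" flag.
var lines = $('#wpTextbox1').val().split('\n');

// Pad missing lines with empty strings so the checks below are safe.
for (var i = 0; i < 5; i++) {
  if (typeof lines[i] === 'undefined') { lines[i] = ''; }
}

if (!ValidURL(lines[0])) { alert(lines[0] + " doesn't look like a valid URL."); }
if (lines[1] === '') { lines[1] = lines[0]; } // no title given: use the URI
if (lines[2] === '') { lines[2] = '5'; }      // default complexity
if (lines[3] === '') { lines[3] = '5'; }      // default rating

// Complexity and rating must be numbers from 1 to 10.
if (parseInt(lines[2], 10) > 10 || parseInt(lines[2], 10) < 1 || isNaN(lines[2])) {
  alert(lines[2] + ' is not a valid value, setting to default 5');
  lines[2] = '5';
}

if (parseInt(lines[3], 10) > 10 || parseInt(lines[3], 10) < 1 || isNaN(lines[3])) {
  alert(lines[3] + ' is not a valid value, setting to default 5');
  lines[3] = '5';
}

// Build the {{B|...}} template call; the fifth parameter is optional.
var text = "{{B|\n" + lines[0] + "\n|" + lines[1] + "\n|" + lines[2] + "\n|" + lines[3];
if (lines[4] !== '') { text += "\n|" + lines[4]; }
text += "\n}}";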

var field = document.getElementById('wpTextbox1');
var textArray = field.value.split("\n");
textArray.splice(0, 4);
textArray[0] = text;
field.value = textArray.join("\n");

function ValidURL(str) {
var pattern = new RegExp('^(https?:\\/\\/)?'+
'((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.?)+[a-z]{2,}|'+
'((\\d{1,3}\\.){3}\\d{1,3}))'+
'(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*'+
'(\\?[;&a-z\\d%_.~+=-]*)?'+
'(\\#[-a-z\\d_]*)?$','i');
return pattern.test(str);
}

The minimized variant of the above now sits nicely in my bookmarks bar, and is bound to a keypress in cVim. So I can fill in just the URI, and it sets everything else to some default values and adds the Mediawiki template formatting.
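For testing outside the browser, the core of the script can be sketched as a pure function with no textarea and no alerts. This is a rough sketch rather than the code I actually run: `buildTemplateCall` is a hypothetical name, it silently falls back to 5 instead of alerting, it only looks at the first five lines, and it skips the URL check.

```javascript
// Hypothetical helper (not part of the script above): turn the short
// line-based input into a {{B|...}} template call.
function buildTemplateCall(input) {
  const lines = input.split('\n');
  // Pad to five entries: URI, title, complexity, rating, "only link" flag.
  while (lines.length < 5) lines.push('');

  const uri = lines[0].trim();
  const title = lines[1].trim() || uri; // no title given: use the URI

  // Accept whole numbers from 1 to 10; anything else becomes the default 5.
  const score = (raw) => (/^(10|[1-9])$/.test(raw.trim()) ? raw.trim() : '5');

  const parts = [uri, title, score(lines[2]), score(lines[3])];
  const onlyLink = lines[4].trim();
  if (onlyLink !== '') parts.push(onlyLink); // fifth parameter is optional

  return '{{B|\n' + parts.join('\n|') + '\n}}';
}

console.log(buildTemplateCall('https://example.org/a\nSome title\n\n7'));
```

Writing the result back into the edit box stays the caller's job, which keeps the string handling easy to check on its own.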

TODO:

Getting the page title automagically (see http://stackoverflow.com/questions/10793585/how-to-pick-the-title-of-a-remote-webpage); I'll need a PHP backend for that. It would also be interesting to check from the PHP side whether the IP making the request is currently logged in to my wiki, and fetch the title only then, to prevent abuse.
Making a bookmarklet which automatically populates most of the fields, like my old Delicious bookmarklet (sigh.)
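The title-extraction step of the first TODO could look roughly like the sketch below. It is written in JavaScript only to stay consistent with the script above; the planned backend is PHP, and fetching the remote page is left out entirely. `extractTitle` is a hypothetical helper, and it does not decode HTML entities.

```javascript
// Hypothetical helper: pull the text of the first <title> element
// out of an HTML string. Returns null when there is no title.
function extractTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  if (!match) return null;
  // Titles often span several lines; collapse whitespace runs to one space.
  return match[1].replace(/\s+/g, ' ').trim();
}

const sampleHtml =
  '<html><head><title>\n  Why our brains love high ceilings\n</title></head></html>';
console.log(extractTitle(sampleHtml));
```

A regular expression is good enough for a quick sketch like this; a real backend would probably be better off with a proper HTML parser.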

Searching the wiki

The search in Semantic Mediawiki is explained pretty well here. Now I can do neat things like "Give me the pages in the Category 'To read' with complexity < 4". And lastly, categories can be inside other categories. If X is in category A, which is a subcategory of B, it still shows up in searches for category B. (example) Pretty nice!
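The same kind of query can also be sent to the wiki from a script. Below is a minimal sketch that only builds the URL for Semantic Mediawiki's `ask` API module. The `buildAskUrl` helper is a made-up name, the `api.php` location is an assumption about my own installation, and `<<` is, as far as I know, the strict "less than" comparator (plain `<` may mean "less than or equal" depending on the wiki's configuration).

```javascript
// Hypothetical helper: build a URL for the "ask" API module.
// conditions: e.g. 'Category:To read' or 'c::<<4' (without the brackets)
// printouts:  property names to return for every matching page
function buildAskUrl(apiUrl, conditions, printouts) {
  const query =
    conditions.map((c) => '[[' + c + ']]').join('') +
    printouts.map((p) => '|?' + p).join('');
  return apiUrl + '?action=ask&format=json&query=' + encodeURIComponent(query);
}

// Pages in the category "To read" with complexity below 4,
// returning their link (l) and title (t).
const askUrl = buildAskUrl(
  'http://www.pchr8.net/wiki/api.php', // assumed location of api.php
  ['Category:To read', 'c::<<4'],
  ['l', 't']
);
console.log(askUrl);
```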

Knowledge Management

Things I want to learn or will probably need pretty often will have their own pages, like the Formulating Knowledge page, simply because interacting with the material always helps much more than just reading it. I also like that it will be represented in a way that is relevant for me, without unnecessary data and with additional material I think should be there.

For the link pages, there will be the link + very short summary (it has been working pretty well) + a couple of thoughts about it, + maybe relevant data or links to other pages.

TODO: Quotes, and move my "To Read" / "To Listen to" lists there. Also think of a better name for it.

Why?

Warum einfach, wenn es auch kompliziert geht? (A nice German phrase about avoiding the unbearable simplicity of being: "Why simple, when it can be complicated as well?")

On a serious note, I don't have any doubts that in the long run I'll be thankful for this system.

Firstly, I control all of this data. Feels good. Take that, capitalist ad-ridden surveillance corporations!

Secondly, working with a lot of information has always been something I do often and enjoy immensely, and it would make sense to start accumulating everything in one place. Every day I stumble upon a lot of material on the Internet, of very different nature, and with not-obvious connections between them. I have more interests than I can count.

Organizing everything like this so far looks to me like the best alternative, and I'm reasonably certain it will work out. There's a lot that can be improved, and I think in a couple of months it will morph into something awesome.

Finding ways to use all the accumulated data is a topic for another day.

(Y)

A couple of nice relevant inspiring places:

http://yourcmc.ru/wiki/ - in Russian, a person using Mediawiki as central hub for everything.

http://konigi.com/wiki/ - personal wiki, mostly design.

http://thingelstad.com/2012/bookmarking-with-semantic-mediawiki/ a much more advanced version of what I'm trying to do, also using Semantic Mediawiki. I should drop him a line :)

In the middle of the desert I can say whatever I want.

serhii.net