Monday, 9 April 2012

bibtexbrowser... Music for Publication Lists (Part II)

Embed bibtexbrowser in your web page


« Part I - A journey through the realms of boredom...

In Part I of this post I shared my thoughts on personal publication lists and I documented my reasons for using Martin Monperrus' bibtexbrowser in my web page.

The script is very simple to install and works off-the-shelf. However, with a bit of tweaking we can achieve better integration with the rest of our site. In this part, I shall share my experiences from customising it, hoping that it might help others who decide to use it.

Modes

We can run bibtexbrowser in 3 modes:
  • Stand Alone: When you directly invoke bibtexbrowser.php and let it generate an entire page. This will look like this or like this (frameset display).
  • Embedded: When you use it inside your own page and you get it to generate a section. A page like that can be seen here.
  • As a library: In this mode, it parses your bib DB and populates its data structures but it doesn't generate any HTML until you ask it to do so at a later stage. More on that later.

Basic Configuration

bibtexbrowser.php includes bibtexbrowser.local.php. This is your chance to override the default configuration and make local changes. So for example if you want to turn off Javascript progressive enhancement, you'd add the line below to your bibtexbrowser.local.php:
define('BIBTEXBROWSER_USE_PROGRESSIVE_ENHANCEMENT',false);
It's documented quite well here, so there is not a lot to say. I'm mentioning it because we'll use it plenty later on.
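Here is a minimal sketch of why an override in bibtexbrowser.local.php sticks. It assumes the main script only applies a default when the constant has not been defined yet; the helper define_default() is hypothetical and only stands in for whatever the script really does.

```php
<?php
// Hypothetical helper: apply a default only when no override exists yet.
function define_default($name, $value) {
    if (!defined($name)) {
        define($name, $value);
    }
}

// What bibtexbrowser.local.php would contain (the override is seen first).
define('BIBTEXBROWSER_USE_PROGRESSIVE_ENHANCEMENT', false);

// What the main script is assumed to do afterwards with its own default.
define_default('BIBTEXBROWSER_USE_PROGRESSIVE_ENHANCEMENT', true);

// The local value survives because a PHP constant cannot be redefined.
var_dump(BIBTEXBROWSER_USE_PROGRESSIVE_ENHANCEMENT);
```

The point of the sketch is the ordering: whatever is defined first wins, which is why the local file has to be included before the defaults are set.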

Embedded Mode

This is also well documented in bibtexbrowser's page. Let's assume that you have a php script called pubs.php and you want to embed your publications list. You will do something like this:
<?php
  $_GET['bib']='mybib.bib';
  $_GET['author']='Your Name';
  include( 'bibtexbrowser.php' );
?> 
That's it. The only thing to do afterwards is write CSS styles to make the generated output look like the rest of your page.
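If you would rather have the generated markup in a string (to drop it into a template, for instance), output buffering is one option. This is only a sketch: capture_publications() is a hypothetical helper, and it assumes bibtexbrowser behaves the same when included from inside a function, which may not hold for every version.

```php
<?php
// Hypothetical helper: run the embedded script and return its HTML as a string.
// Call it at most once per request, since the script defines functions and classes.
function capture_publications($bib, $author, $script = 'bibtexbrowser.php') {
    if (!is_file($script)) {
        return null; // the script is not installed next to this page
    }
    $_GET['bib'] = $bib;
    $_GET['author'] = $author;
    ob_start();            // buffer everything the script echoes
    include $script;
    return ob_get_clean(); // stop buffering and hand back the markup
}

$html = capture_publications('mybib.bib', 'Your Name');
echo $html === null ? '<p>Publication list unavailable.</p>' : $html;
```

The is_file() guard means the page degrades to a short message instead of a PHP warning when the script is missing.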

Library Mode

But in frameset mode there are great and handy authors, years, types and other menus on the left. Why can't I have them in embedded mode?

If you browse through bibtexbrowser's code, you will see something about a 'New undocumented feature' (in version v20111211 this is near line 1746). In your embedding pubs.php script where you include bibtexbrowser, you can do this (notice line 1):
$_GET['library']=1;
$_GET['bib']='mybib.bib';
$_GET['all']=1;
include( 'bibtexbrowser.php' );
/* We have included bibtexbrowser but it hasn't generated anything yet */

setDB(); /* Read the bibliography and populate data structures */

new IndependentYearMenu(); /* Generate the years menu */

/* Do more stuff */

new Dispatcher(); /* Generate bibtexbrowser HTML */
The IndependentYearMenu class generates the HTML for the year menu from its constructor (it does, honest!). Remember, your bibtexbrowser.local.php is your friend. You can copy this class over from bibtexbrowser.php and modify it as you please. This is how I'm generating my custom authors menu:
<?php
    class CustomAuthorsMenu {
        function __construct() { /* PHP 5+ constructor; PHP 8 no longer treats a method named after the class as one */
            if (!isset($_GET[Q_DB])) {die('Did you forget to call setDB() before instantiating this class?');}
            $authorIndex = $_GET[Q_DB]->authorIndex();
            ?>
            <div id="authormenu" class="filterbox toolbox">
                <div class="filterprop">
                    <a href="#authormenu" title="Expand/Collapse" onclick="toggle_toolbox('authorlist'); return false;">±</a>
                </div>
                <span class="this">Authors:</span>
                <div class="filterlist" id="authorlist">
                    <?php
                    echo '<span><a '.makeHref(array(Q_AUTHOR=>'.*')).'>All</a></span>'."\n";
                    foreach($authorIndex as $author) {
                        echo '<span><a '.makeHref(array(Q_AUTHOR=>$author)).'>'.$author.'</a></span>'."\n";
                    }
                    ?>
                </div>
            </div>
            <?php
        }
    }
?>
Between setDB() and new Dispatcher() you can do whatever you like. So this is how I'm loading bibtexbrowser (in my pubs.php):
<?php
    $_GET['library']=1;
    $_GET['bib']='geo.bib;authors.bib';
    $_GET['all']=1;
    include('bibtexbrowser.php');

    setDB();

    new CustomAuthorsMenu();
    new CustomYearMenu();
?>

<div id="bodyText">

<?php
    include('Template/conditions.php');
    new Dispatcher();
?>

</div> 
The process is similar for the years menu. The outcome can be seen here. Check out the author and year menus on the left.
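For the years menu, the same idea can be sketched as a plain function that is easy to try without bibtexbrowser loaded. Everything here is hypothetical: render_year_menu(), the $href callback (standing in for the role makeHref() plays in the class above) and the "year" query parameter are illustrative names, not bibtexbrowser's API.

```php
<?php
// Hypothetical pure function: $years is a list of years and $href builds the
// attribute string for one link (the job makeHref() does in the real script).
function render_year_menu(array $years, callable $href) {
    $links = array('<span><a ' . $href('.*') . '>All</a></span>');
    foreach ($years as $year) {
        $label = htmlspecialchars((string) $year, ENT_QUOTES, 'UTF-8');
        $links[] = '<span><a ' . $href($year) . '>' . $label . '</a></span>';
    }
    return '<div id="yearmenu" class="filterbox toolbox">' . "\n"
         . implode("\n", $links) . "\n</div>\n";
}

// Stand-in for makeHref(); the parameter name "year" is an assumption.
$href = function ($year) {
    return 'href="?year=' . rawurlencode((string) $year) . '"';
};

echo render_year_menu(array(2012, 2011), $href);
```

Inside a real CustomYearMenu class, the list of years would come from the database object after setDB(), in the same way the authors menu gets its author index.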

But, what about the search form?

The form is generated by the function searchView(), which doesn't get called in embedded mode. The process is similar: we can copy it over to our pubs.php and modify it.
<form action="?Academic" method="get">
    <div class="sortbox toolbox">
        <a href="#" onclick="toggle();return false;">Raw List / Grouped<br /></a>
        <a href="?Academic">By Type<br /></a>
        <a href="?Year">By Year<br /></a>
        <div class="search">
            <input type="text" name="search" class="input_box" id="searchtext"/>
            <input type="hidden" name="bib" value="geo.bib"/>
            <input type="submit" value="search" class="input_box"/>
        </div>
    </div>
</form>
Those of you who are observant will have noticed that, compared to the original:
  • I've replaced bibtexbrowser's constants with hard-coded values. This is because I'm generating the form at the wrong position in the script; we could generate it after including bibtexbrowser.php and keep the constants. I'll be changing that soon...
  • I'm putting the entire <div> inside the <form> and not the other way round. This is for XHTML 1.1 compliance. More on that later.
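One way to avoid the hard-coded values is to build the form from parameters. This is a sketch under assumptions: render_search_form() and its argument names are hypothetical, and in pubs.php the arguments could be bibtexbrowser's own constants once bibtexbrowser.php has been included. The sort links are left out to keep it short.

```php
<?php
// Hypothetical helper: build the search form from parameters rather than
// hard-coded literals. All four argument names are made up for this sketch.
function render_search_form($action, $searchParam, $bibParam, $bibFile) {
    // Escape every value before it lands inside an attribute.
    $e = function ($s) {
        return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
    };
    return '<form action="' . $e($action) . '" method="get">' . "\n"
         . '    <div class="sortbox toolbox">' . "\n"
         . '        <div class="search">' . "\n"
         . '            <input type="text" name="' . $e($searchParam) . '" class="input_box" id="searchtext"/>' . "\n"
         . '            <input type="hidden" name="' . $e($bibParam) . '" value="' . $e($bibFile) . '"/>' . "\n"
         . '            <input type="submit" value="search" class="input_box"/>' . "\n"
         . '        </div>' . "\n"
         . '    </div>' . "\n"
         . '</form>' . "\n";
}

// Same values as the hand-written form above.
echo render_search_form('?Academic', 'search', 'bib', 'geo.bib');
```

The <div> stays inside the <form>, as in the hand-written version, so the XHTML 1.1 nesting is preserved.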

Individual Publication Pages

bibtexbrowser can also display an individual page for a publication, such as for example this one. If you look at this page's source, you will notice many lines like these:
<meta name="DC.Title" content="A Model-driven Measurement Approach"/>
<meta name="citation_title" content="A Model-driven Measurement Approach"/>
These lines are my favorite bibtexbrowser feature: bibliographic metadata. The first one is Dublin Core, the second is Google Scholar metadata. The problem here is that if you run bibtexbrowser embedded, the script that generates the page's head is the embedding script, not bibtexbrowser. Bottom line, I advise against running individual pages embedded.

An idea that I'm planning to have a stab at is to change the bibtexbrowser script so that when it encounters the argument key, it will generate the metadata headers and COinS before bailing out. Then, the embedding script can call suitable methods, retrieve the values and add them to the generated page's head.
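On the embedding side, the last step of that idea could look roughly like this. It is a sketch only: render_meta_tags() is a hypothetical helper, and how the name/value pairs would be retrieved from bibtexbrowser is exactly the part that remains to be written.

```php
<?php
// Hypothetical helper: turn a name => value array into <meta> tags that the
// embedding script can print inside its own <head>.
function render_meta_tags(array $metatags) {
    $out = '';
    foreach ($metatags as $name => $value) {
        $out .= '<meta name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8')
              . '" content="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
              . '"/>' . "\n";
    }
    return $out;
}

// Values hard-coded here; in practice they would come from the bib entry.
$title = 'A Model-driven Measurement Approach';
echo render_meta_tags(array(
    'DC.Title'       => $title,
    'citation_title' => $title,
));
```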

XHTML 1.1

bibtexbrowser generates valid XHTML 1.0 Transitional. That's great but the rest of my site is XHTML 1.1. If we try to validate the page as 1.1, we get the following error:

Line X, Column Y: there is no attribute "name"
<td class="bibref"><a name= " 2">

The workaround for this one is simple: just change name="foo" to id="foo". Note that the patch also prefixes the value with "id", since an XML id may not start with a digit:
--- /Users/cexgo/Documents/spd.gr/bibtexbrowser.php.txt 2012-04-05 14:23:24.000000000 +0100
+++ bibtexbrowser.php   2012-04-09 00:54:01.000000000 +0100
@@ -1336,28 +1338,41 @@ class BibEntry {
   */
   function toTR() {
         echo '<tr class="bibline">';
-        echo '<td  class="bibref"><a name="'.$this->getId().'"></a>['.$this->getAbbrv().']</td> ';
+        echo '<td  class="bibref"><a id="id'.$this->getId().'"></a>['.$this->getAbbrv().']</td> ';
         echo '<td class="bibitem">';
         echo bib2html($this);
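If you prefer not to patch bibtexbrowser.php, the same rewrite could in principle be applied to the generated markup afterwards. This is a different technique from the patch above (post-processing with a regular expression), and anchors_name_to_id() is a hypothetical helper. It only handles the exact <a name="..."> form shown in the validator error, and any links pointing at the old anchors would need the same prefix.

```php
<?php
// Hypothetical post-processing step: turn <a name="X"> into <a id="idX">.
// The "id" prefix keeps the value valid even when X starts with a digit.
function anchors_name_to_id($html) {
    return preg_replace('/<a name="([^"]*)"/', '<a id="id$1"', $html);
}

echo anchors_name_to_id('<td class="bibref"><a name="2"></a>[2]</td>') . "\n";
```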
We're not done yet. Even if the markup is valid XHTML 1.1, we need to make sure that the page is served as such. When running in embedded mode, things are easy since the headers are generated by our own script. In stand-alone mode though, this is what the server sends:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
We need to change the DOCTYPE and a few other bits and bobs. Although technically Content-type:text/html will work, according to the specs it may not be used and should be replaced by application/xhtml+xml. Here's how I've patched bibtexbrowser.php:
--- /Users/cexgo/Documents/spd.gr/bibtexbrowser.php.txt 2012-04-05 14:23:24.000000000 +0100
+++ bibtexbrowser.php   2012-04-09 00:54:01.000000000 +0100
@@ -2878,13 +2916,13 @@ function HTMLWrapper(&$content,$metatags=array()/* an array name=>value*/) {
 
 // when we load a page with AJAX
 // the HTTP header is taken into account, not the <meta http-equiv>
-header('Content-type: text/html; charset='.ENCODING);
-echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'."\n";
-
+    header ("Content-Type:application/xhtml+xml; charset=utf-8");
+    echo '<?xml version="1.0" encoding="UTF-8"?>'."\n"
 ?>
-<html xmlns="http://www.w3.org/1999/xhtml">
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
 <head>
-<meta http-equiv="Content-Type" content="text/html; charset=<?php echo ENCODING ?>"/>
+<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8"/>
 <meta name="generator" content="bibtexbrowser v20111211" />
 <?php if ($content->getRSS()!='') echo '<link rel="alternate" type="application/rss+xml" title="RSS" href="'.$content->getRSS().'&amp;rss" />'; ?>
 <?php 
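The patch sends application/xhtml+xml unconditionally. A more cautious variant, sketched below, checks the browser's Accept header first and falls back to text/html. This is not part of the patch above: choose_content_type() is a hypothetical helper, and it deliberately ignores q-values in the header.

```php
<?php
// Hypothetical helper: pick a media type from the browser's Accept header.
// Simplification: any mention of application/xhtml+xml counts, q-values are ignored.
function choose_content_type($accept) {
    if (stripos($accept, 'application/xhtml+xml') !== false) {
        return 'application/xhtml+xml';
    }
    return 'text/html';
}

$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
$type = choose_content_type($accept);
if (!headers_sent()) {
    header('Content-Type: ' . $type . '; charset=utf-8');
}
```

Whether the text/html fallback is acceptable depends on how strictly you want to follow the XHTML 1.1 recommendation on media types.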
A few more things to keep in mind. By default bibtexbrowser opens individual pages in the same window/tab. If you prefer using a new tab, you can override the default by adding this to your bibtexbrowser.local.php:
define('BIBTEXBROWSER_BIB_IN_NEW_WINDOW',true);
If you do that, you will find that some target="foo" attributes will creep into the generated markup. The page will then fail to validate against XHTML 1.1, with errors like this:

Line X, Column Y: there is no attribute "target"
 …an</a>, George Oikonomou, "<a target= " _blank"

I don't have a workaround for the target="_blank" thing since it's a feature that I'm never going to use, I just had to point it out since I noticed it. Lastly, keep in mind that the frameset version will not validate either, due to the differences in framing methods between XHTML 1.0 and 1.1. If you try to solve this, you are probably looking at quite a beast.
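For anyone who does want to experiment, one untested possibility is to strip the attribute from the generated markup. Note that this defeats the purpose of opening links in a new window, so it only makes sense if validation matters more than the feature. strip_target_attributes() is a hypothetical helper and only handles double-quoted attribute values.

```php
<?php
// Hypothetical post-processing step: remove every target="..." attribute.
function strip_target_attributes($html) {
    return preg_replace('/\s+target="[^"]*"/', '', $html);
}

echo strip_target_attributes('<a target="_blank" href="paper.pdf">pdf</a>') . "\n";
```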

Summary

In order to better integrate bibtexbrowser with my publication list here, I made a few modifications and tweaks. I believe that bibtexbrowser is an excellent script. In this post I am sharing my observations, hoping to help those of you who want to exploit some of its cool features and who are curious enough to want to take things a few steps further than a vanilla installation.


Sunday, 8 April 2012

bibtexbrowser... Music for Publication Lists (Part I)

A journey through the realms of boredom...

If your job involves writing academic papers, you have no real option but to maintain a publications list on your web page.

<rant>
To make things worse, in addition to an author's personal page, there is the institutional page which also needs to be kept up-to-date.
Some are still stuck with good old ~geo/public_html pages and manual updates. Most organisations have taken a step forward and have implemented their own bespoke publication management systems ('institutional repositories'). Those systems usually share some common characteristics: i) a poor attempt at a catchy "SOME-ACRO" name, ii) they cost $£¥€ to develop and iii) they are quite rubbish...
I mean seriously people, import from bibtex must be the first feature to implement, yet whenever I've landed a new job I've had to spend hours in front of some appalling, counter-intuitive, pointy pointy clicky clicky web-based interface, manually data-entering my (short) publication list. And then the management complains if the list is not being kept updated.
I can't even begin to imagine life for people with very long lists, (although I suspect they also have the option to delegate the task). Sometimes I wonder if this is why the term 'Selected Papers' was invented in the first place...
</rant>

But back to our personal web page. Doing everything manually is obviously not an option; it's tedious, boring, error-prone and deprives you of all the features that you could get if you were to generate the page dynamically (sort, search etc).

I simply could not possibly be bothered...

First Paradigm: Generate offline, serve statically

One can write a set of scripts to generate static HTML content from bibliographic data. Whenever the data are updated, content must be re-generated. This however does not necessarily have to be triggered manually by the user. It can be, for example, a cron job which runs every few hours. This is a lot less burden on your back-end than generating a page on the fly every time it's accessed. This is actually a very good approach, since the frequency of updates to a publication list is very low compared to the frequency at which it's being accessed.
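The "regenerate only when the data changed" part of this approach can be sketched with a make-style timestamp check. needs_regeneration() and the file names in the comment are hypothetical, and the actual page generation is left out.

```php
<?php
// Hypothetical helper: true when the static page is missing or older than
// the bibliography it was generated from.
function needs_regeneration($bibFile, $htmlFile) {
    if (!is_file($htmlFile)) {
        return true;
    }
    return filemtime($bibFile) > filemtime($htmlFile);
}

// A cron-driven script could then do something along these lines:
//   if (needs_regeneration('mybib.bib', 'pubs.html')) { /* rebuild pubs.html */ }
```

With a check like this the cron job can run as often as you like, because it does no real work unless the bibliography has been touched.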

This relies on having scripts which can write directly to the file system hosting the content. If your personal page is hosted in your own box or on some company or university server, this is often not a problem. However, if you are using some commercial hosting company this approach is probably not an option.

Dynamic Generation, Take One

(or 'How to make your own life harder than it needs to be')

As discussed earlier, I skipped the manual update business altogether. I would have gone for the approach above but when I mentioned the words shell, cron job, script and make to my hosting provider techies (who happen to be mates), they laughed at me with passion and said "Just generate it dynamically". That was a couple of years ago (or four).

I then had a quick look around the internet. What I was looking for was a system that would read from bibtex files, generate a page and add various links to each entry (DOI, publisher URL, pre-print pdf if available etc). Unfortunately, I couldn't quite find what I was looking for. Most solutions were either overkill or lacked features, so I ended up knocking together my own php-based system. At that time I was having my first go at php so, being rather clueless, I feared it would take me ages to write a bibtex parser.

I ended up with an intermediate solution. Here is how it worked, briefly:
  • As a starting point, I kept each paper in its own bibtex.
  • Using the bibutils suite, I would generate bibliographic data in MODS format - one MODS file from each bibtex file. This was done offline with GNU make. Remember: no shell access to the hosting machine, so this would take place in my own box.
  • The same build system would collate all individual .bib files to a single bibtex DB.
  • I would then upload the updated .bib and MODS files to my server. The MODS format is XML so it was very easy to parse in PHP and generate the pages.
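To give an idea of the last step, here is a minimal sketch of pulling titles out of a MODS document with SimpleXML. It is not the code my old system used: mods_titles() is a hypothetical helper, the sample record is hand-written, and it assumes the MODS version 3 namespace.

```php
<?php
// Hypothetical helper: return every title found in a MODS document.
function mods_titles($xmlString) {
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        return array(); // not well-formed XML
    }
    // MODS elements live in a namespace, so XPath needs a prefix for it.
    $xml->registerXPathNamespace('m', 'http://www.loc.gov/mods/v3');
    $titles = array();
    foreach ($xml->xpath('//m:titleInfo/m:title') as $node) {
        $titles[] = trim((string) $node);
    }
    return $titles;
}

// Hand-written sample record, reduced to the title only.
$sample = '<modsCollection xmlns="http://www.loc.gov/mods/v3"><mods>'
        . '<titleInfo><title>A Model-driven Measurement Approach</title></titleInfo>'
        . '</mods></modsCollection>';

print_r(mods_titles($sample));
```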
At the start I thought this was really cool but I quickly started stumbling. Every time I had to make a change, no matter how minor, I had to:
  • Run make
  • Identify which bib and MODS files were affected by the change and re-upload them to the server. I'd also have to upload the integrated bib DB (which I would often forget).
A small typo or encoding error in the bib: make and upload. Paper got accepted: make and upload. Paper went live in IEEE Xplore and got a DOI: make and upload. To make matters even worse, my web hosting uses a point and click upload interface so I couldn't even script the upload. I had to do it by hand but it was still a lot better than manually maintained static pages.

It only took a couple of new papers, a few dozen typos and a few thousand clicks to realise that:

I simply could no longer be bothered...

Next Attempt at Dynamic Generation: bibtexbrowser

(or 'I wish I'd spotted this earlier'...)

Thursday the 5th April 2012 was a day of revelation. That time came again when I had to update my publications list but I'd had enough of make/upload cycles. I googled "bibtex php parser" and I came across Martin Monperrus' bibtexbrowser. This excellent php script does exactly what I needed but hadn't found previously: The user uploads a bibtex file to the hosting box, the script generates the publication list on the fly.

The paradigm is the same as my previous system: 'php reads bibliography DB, php generates page'. However, it does so straight from the bibtex file without the need of an intermediate format, thus reducing maintenance overhead from 'annoying' to 'ridiculously trivial'.

It also has a few features that give it even further added value:
  • It can be used standalone, embedded in your own page or as a library. The library functionality is rather awesome but undocumented at the moment, I only found out about it by reading a bunch of comments inside the script.
  • It is very easy to customise.
  • It adds Google Scholar meta-data to your pages (and eprints and Dublin Core, if you want).
  • It generates COinS metadata so that software like Zotero or Mendeley can directly import from your page.
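For the curious, a COinS record is just an empty span with class Z3988 whose title attribute carries the citation as URL-encoded key/value pairs. The sketch below shows the general shape for a journal article. It is not bibtexbrowser's code: coins_span() is a hypothetical helper and only two descriptive fields are included.

```php
<?php
// Hypothetical helper: build a COinS <span> for a journal article.
function coins_span($title, $year) {
    $kev = http_build_query(array(
        'ctx_ver'     => 'Z39.88-2004',
        'rft_val_fmt' => 'info:ofi/fmt:kev:mtx:journal',
        'rft.atitle'  => $title,
        'rft.date'    => $year,
    ), '', '&', PHP_QUERY_RFC3986);
    // The ampersands must be escaped because the string sits in an attribute.
    return '<span class="Z3988" title="' . htmlspecialchars($kev) . '"></span>';
}

echo coins_span('A Model-driven Measurement Approach', '2012') . "\n";
```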
I played around with it over the last couple of days. Not only did it make me put the previous system swiftly in the bin, but it also made me want to tell the world how good it is.

In the next part of this post, I am going to share my experiences from customising it.

Multicast Support for 6LoWPANs with the Contiki OS

A few months ago I started work on adding IPv6 multicast support to the Contiki Operating System. This effort has resulted in extensions to Contiki's core networking code and its RPL implementation. Additionally, I have implemented two multicast forwarding algorithms:
  • Multicast Forwarding with Trickle: This implements the multicast algorithm described in this internet draft.
  • Stateless Multicast RPL Forwarding (SMRF): The RPL routing protocol enters Mode of Operation (MOP) 3 and handles multicast group management as per the RPL documents. SMRF is a lightweight engine which handles datagram forwarding.
SMRF is described in detail in this paper, alongside a thorough comparison between the two algorithms in terms of performance, datagram loss rates and energy consumption.

The implementation is currently under review by the core Contiki developer team, awaiting their approval before I can push it upstream. Until that day, it is hosted on github (branch 'mcast-forward'), as a fork of the Contiki OS.

I have tested things with the Cooja simulator (msp430-gcc) as well as with cc2530/cc2430 devices (sdcc). I am especially interested in success/failure reports on devices relying on other toolchains, such as AVR-based nodes or econotags, since I don't have access to the hardware.

Feel free to have a stab!

Thursday, 5 April 2012

Contiki's cc2x30 ports are now upstream

Contiki's ports for TI's cc2530 Development Kit and Sensinode/cc2430 devices have been merged back upstream and I shall be maintaining them actively.

Users who wish to use them should first read the guides listed here. Examples can be found in $CONTIKI/examples/cc2530dk and examples/sensinode. With a correct SDCC installation, you can expect examples from these directories in Contiki's master branch to compile cleanly.

Examples demonstrating basic functionality (non-uIPv6) should work off the shelf. If they don't, you are possibly looking at a bug so feel free to shout on the contiki-developers mailing list.

For more advanced examples (uIPv6), we need a set of patches to Contiki's core, aiming to reduce stack depth during code execution and prevent overflows. These patches are not very suitable for other Contiki platforms so they are highly unlikely to ever be merged upstream. They are currently being hosted on GitHub, on a fork of the Contiki git repo (branch cc-ports). More techy details, whys and hows can be found here.

Enjoy!