...making Linux just a little more fun!

<-- prev | next -->

Building a simple del.icio.us clone

By Jimmy O'Regan

I recently made the move from Mandrake to Ubuntu, and while I was unpacking in my new $HOME (sorry, I couldn't resist), I came across a little del.icio.us clone I wrote in PHP to test out the RDF generating script I wrote (well, adapted) to go with my article about outliners.

I decided to use PHP, basically because it's the only language I've used for web programming (aside from a misguided moment in my first year in college when I wrote something in Pascal). I decided to use SQLite for the database because I didn't feel like installing MySQL: PHP's SQLite functions are pretty similar to the MySQL equivalent anyway, so it's no big deal.

Please note that what I am presenting in this article does very little: I needed to test a script, and only cloned the parts I needed to do that. I did go a little further, but forgot about it until now. Doing something useful is for a future article!

Because I only needed it to give me simple XML output, I managed to get all I needed from a single script. First, I set up a sample database:

BEGIN TRANSACTION;
CREATE TABLE bookmarks (url TEXT, title TEXT, desc TEXT, keywords TEXT,
date TEXT, id INTEGER PRIMARY KEY);
INSERT INTO bookmarks VALUES('http://sqlite.org/datatype3.html',
'Datatypes In SQLite Version 3','','sqlite programming','2004-10-16T20:23:49Z',1);
INSERT INTO bookmarks VALUES('http://sqlite.org/lang.html',
'Query Language Understood By SQLite','','sqlite programming','2004-10-16T20:25:36Z',2);
INSERT INTO bookmarks VALUES('http://www.team-teso.net/releases.php',
'releases of teso','','computing 404','2003-07-15T22:29:38Z',3);
INSERT INTO bookmarks VALUES('http://ssshotaru.homestead.com/files/aolertranslator.html',
'The AOLer Translator','','humour','2003-11-18T00:45:35Z',4);
INSERT INTO bookmarks VALUES('http://www.onlineconversion.com/unix_time.htm',
'Online Conversion - Unix time conversion','','misc','2004-10-16T20:43:44Z',5);
COMMIT;

Then, I wrote a script to give me the output:

<?php echo '<?xml version="1.0" standalone="yes" encoding="UTF-8"?'.'>' ?>
<!-- http://ie.php.net/manual/en/language.basic-syntax.php#41654 -->

<posts tag="" user="">
<?php 

if (!extension_loaded("sqlite"))
{
        dl("sqlite.so");
}

if ($db = sqlite_open("/tmp/bookmarks.sqlite", 0666, $err))
{
        $result = sqlite_query ($db, "SELECT * FROM bookmarks");
        while (sqlite_has_more($result))
        {
                $post = sqlite_fetch_array ($result);
		
		$url = 'href="'.htmlentities($post['url']).'"';
		$title = 'description="'.htmlentities($post['title']).'"';
                if ($post['desc'] != "")
		{
	                 $desc = 'extended="'.htmlentities($post['desc']).'"';
		}
		else
		{
			$desc = "";
		}
		$date = 'time="'.$post['date'].'"';
		# Don't know if this is how it's done, but it's close enough
		$hash = 'hash="'.md5($url).'"';
		$tags = 'tag="'.$post['keywords'].'"';

		print " <post $url $title $desc $hash $tags $date />";
		printf ("\n");
       }
}
?>		
</posts>

Running this with the sample database gave me this output:

<?xml version="1.0" standalone="yes" encoding="UTF-8"?><!-- http://ie.php.net/manual/en/language.basic-syntax.php#41654 -->

<posts tag="" user="">
 <post href="http://sqlite.org/datatype3.html" description="Datatypes In 
 SQLite Version 3"  hash="56faa06a48016408c5042c7e4bfd3c24" tag="sqlite 
 programming" time="2004-10-16T20:23:49Z" />
 <post href="http://sqlite.org/lang.html" description="Query Language 
 Understood By SQLite"  hash="7a7eb0095ca227e7003c4a0f0a4a1fd9" tag="sqlite 
 programming" time="2004-10-16T20:25:36Z" />
 <post href="http://www.team-teso.net/releases.php" description="releases of 
 teso"  hash="bce6a8d5ecb506ff57be063083253e15" tag="computing 404" 
 time="2003-07-15T22:29:38Z" />
 <post href="http://ssshotaru.homestead.com/files/aolertranslator.html" 
 description="The AOLer Translator"  hash="99fbdd9eb3e03624c65b15d06a82388a" 
 tag="humour" time="2003-11-18T00:45:35Z" />
 <post href="http://www.onlineconversion.com/unix_time.htm" 
 description="Online Conversion - Unix time conversion"  
 hash="f2fd9548118ac815edee17466c58abe1" tag="misc" 
 time="2004-10-16T20:43:44Z" />
		
</posts>

Which is pretty close to the output given by http://del.icio.us/api/posts/get?.

It did what I wanted, it didn't take long to write, and I had fun doing it. So I thought I might try to make it do a bit more.

Next, I decided to tackle the page that returns the list of tags, to run delicious_mind on it:

<?php echo '<?xml version="1.0" standalone="yes" encoding="UTF-8"?'.'>' ?>

<tags>
<?php 

//error_reporting(E_ALL);
//Error reporting? I have no errors! ... erm... not now, at least.

if (!extension_loaded("sqlite"))
{
        dl("sqlite.so");
}

// I really should do something with $err
if ($db = sqlite_open("/tmp/bookmarks.sqlite", 0666, $err))
{
	$tags = array();
	$prevtags = array();
	
        $result = sqlite_query ($db, "SELECT keywords FROM bookmarks");
        while (sqlite_has_more($result))
        {
                $foo = sqlite_fetch_array ($result);
		$thistag = split(" ", $foo['keywords']);
		// print_r(array) is worth remembering.
		$prevtags = array_merge($tags);
		$tags = array_merge($prevtags, $thistag);
        }
	natcasesort($tags);
	$uniq=array_unique($tags);
	$count=array_count_values($tags);
        foreach ($uniq as $tag)
        {
		$c = $count[$tag];
                print "<tag count='$c' tag='$tag' />";
		printf("\n");
        }
}
?>		
</tags>

According to del.icio.us's API documentation the posts URL accepts two parameters: tag and date, which allow you to filter the results you receive. The next step was to make the first script do that:

<?php 
echo '<?xml version="1.0" standalone="yes" encoding="UTF-8"?'.'>'; 

if (!extension_loaded("sqlite"))
{
        dl("sqlite.so");
}

if ($_GET['tag'] != "")
{
	$tag = $_GET['tag'];
}

if ($_GET['dt'] != "")
{
	$dt = $_GET['dt'];
}

printf ("<posts ");
if ($dt)
{
	printf ("dt='$dt' ");
	$query = "SELECT * FROM bookmarks where date like '%$dt%'";
}
if ($tag)
{
	printf ("tag='$tag'");
	$query = "SELECT * FROM bookmarks where keywords like '%$tag%'";
	// The problem with this is that it matches too much.
	// Using my example, if I search for 'sql' I should get nothing,
	// but instead it matches 'sqlite'
	// Close enough for my purposes, though I should use the stuff I
	// have for extracting the tags.
}
else
{
	printf ("tag=''");
}
printf (" user=''>\n");

if ($tag && $dt)
{
	$query = "SELECT * FROM bookmarks where date like '%$dt%' and keywords like '%$tag%'";
}

if (!$tag && !$dt)
{
	$query = "SELECT * FROM bookmarks";
}

if ($debug) echo "<!-- '$query' -->";

if ($db = sqlite_open("/tmp/bookmarks.sqlite", 0666, $err))
{
        $result = sqlite_query ($db, $query);
        while (sqlite_has_more($result))
        {
                $post = sqlite_fetch_array ($result);
		
		$url = 'href="'.htmlentities($post['url']).'"';
		$title = 'description="'.htmlentities($post['title']).'"';
                if ($post['desc'] != "")
		{
	                 $desc = 'extended="'.htmlentities($post['desc']).'"';
		}
		else
		{
			$desc = "";
		}
		$date = 'time="'.$post['date'].'"';
		// Used the wrong variable in the first example
		$hash = 'hash="'.md5($post['url']).'"';
		$tags = 'tag="'.$post['keywords'].'"';

		print " <post $url $title $desc $hash $tags $date />";
		printf ("\n");
       }
}
?>		
</posts>

As noted in the comments, the tags matched too much: if I specified 'sql' as the tag to filter for, it would match 'sqlite', if that was the last tag in the list. I wrote a simple search script while thinking about what I was going to do next:


<?php

if (!extension_loaded("sqlite"))
{
        dl("sqlite.so");
}

if (!$_GET['search'])
{
	echo "<form method='GET'>";
	echo "<input name='search' value='' type='text' size='80'>";
	echo "<input type='submit'>";
	echo "</form>";
}
	
else 
{
	$search = $_GET['search'];
	
	if ($db = sqlite_open("/tmp/bookmarks.sqlite", 0666, $err))
	{
		
		echo "<form method='GET'>";
		echo "<input name='search' value='$search' type='text' size='80'>";
		echo "<input type='submit'>";
		echo "</form>";
		
		$query = "SELECT * FROM bookmarks WHERE url LIKE '%$search%' OR title LIKE '%$search%'";
		$result = sqlite_query ($db, $query);
		while (sqlite_has_more($result))
		{
			$post = sqlite_fetch_array ($result);
			$url = $post['url'];
			echo "<p><a href='$url'>";
			echo $post['title'];
			echo "</a>";
			// keywords, edit
		}
	}
}

In the end I decided to simply pad each tag out with spaces, because it was easier that way, leaving me with new versions of the posts and tags scripts:

(text version)

<?php 
echo '<?xml version="1.0" standalone="yes" encoding="UTF-8"?'.'>'; 

if (!extension_loaded("sqlite"))
{
        dl("sqlite.so");
}

if ($_GET['tag'] != "")
{
	$tag = $_GET['tag'];
}

if ($_GET['dt'] != "")
{
	$dt = $_GET['dt'];
}

printf ("<posts ");
if ($dt)
{
	printf ("dt='$dt' ");
	$query = "SELECT * FROM bookmarks where date like '%$dt%'";
}
if ($tag)
{
	printf ("tag='$tag'");
	$query = "SELECT * FROM bookmarks where keywords like '% $tag %'";
}
else
{
	printf ("tag=''");
}
printf (" user=''>\n");

if ($tag && $dt)
{
	// A simple change, to prevent false positives: pad the keywords
	// field with spaces :)
	$query = "SELECT * FROM bookmarks where date like '%$dt%' and keywords like '% $tag %'";
}

if (!$tag && !$dt)
{
	$query = "SELECT * FROM bookmarks";
}

if ($debug) echo "<!-- '$query' -->";

if ($db = sqlite_open("/tmp/bookmarks.sqlite", 0666, $err))
{
        $result = sqlite_query ($db, $query);
        while (sqlite_has_more($result))
        {
                $post = sqlite_fetch_array ($result);
		
		$url = 'href="'.htmlentities($post['url']).'"';
		$title = 'description="'.htmlentities($post['title']).'"';
                if ($post['desc'] != "")
		{
	                 $desc = 'extended="'.htmlentities($post['desc']).'"';
		}
		else
		{
			$desc = "";
		}
		$date = 'time="'.$post['date'].'"';
		// Used the wrong variable in the first example
		$hash = 'hash="'.md5($post['url']).'"';
		$tags = 'tag="'.trim($post['keywords']).'"';

		print " <post $url $title $desc $hash $tags $date />";
		printf ("\n");
       }
}
?>		
</posts>

(text version)

<?php echo '<?xml version="1.0" standalone="yes" encoding="UTF-8"?'.'>' ?>

<tags>
<?php 

//error_reporting(E_ALL);
//Error reporting? I have no errors! ... erm... not now, at least.

if (!extension_loaded("sqlite"))
{
        dl("sqlite.so");
}

// I really should do something with $err
if ($db = sqlite_open("/tmp/bookmarks.sqlite", 0666, $err))
{
	$tags = array();
	$prevtags = array();
	
        $result = sqlite_query ($db, "SELECT keywords FROM bookmarks");
        while (sqlite_has_more($result))
        {
                $foo = sqlite_fetch_array ($result);
		$thistag = split(" ", $foo['keywords']);
		// print_r(array) is worth remembering.
		$prevtags = array_merge($tags);
		$tags = array_merge($prevtags, $thistag);
        }
	natcasesort($tags);
	$uniq=array_unique($tags);
	$count=array_count_values($tags);
        foreach ($uniq as $tag)
        {
		if ($tag != '')
		{
			$c = $count[$tag];
                	print "<tag count='$c' tag='".trim($tag)."' />";
			printf("\n");
		}
        }
}
?>		
</tags>

I needed the database changed to work with these scripts, so I added a new script to generate the SQL. I should have written it to add the data to the database directly, but never got around to doing that.

<?
$url = $_POST['url'];
$title = $_POST['title'];
$desc = $_POST['desc'];
$keywords = $_POST['keywords'];
$date = date("Y-m-d\TH:i:s\Z", $_POST['date']);


$f = fopen("/tmp/bookmarks.sql", "a");
fwrite($f, "
INSERT INTO bookmarks VALUES ('$url', '$title', '$desc', 
'$keywords', '$date', NULL);
");
?>

The form to call the script:

<html>
<head>
	<title>Post Bookmark</title>
</head>
<body>
	
<form method="POST" action="make-sql.php">
<table>
<tr>
<td>URL:</td>
<td><input type="text" name="url" size="80"></td>
</tr>
<tr>
<td>Title:</td>
<td><input type="text" name="title" size="80"></td>
</tr>
<tr>
<td>Description</td>
<td><input type="text" name="desc" size="80"></td>
</tr>
<tr>
<td>Keywords</td>
<td><input type="text" name="keywords" size="80"></td>
</tr>
<tr>
<td>Date</td>
<td><input type="text" name="date" size="80"></td>
</tr>
<tr>
<td><input type="submit" name="submit" value="Submit"></td>
</tr>
</table>
</form>

</body>
</html>

And the output:

INSERT INTO bookmarks VALUES ('http://www.mozilla.org/projects/plugins/scripting-plugins.html', 
'Scripting Plugins with Mozilla', '', ' mozilla javascript ', '2003-08-09T20:58:38Z', NULL);
INSERT INTO bookmarks VALUES ('http://www.mozilla.org/docs/xul/xulnotes/xulnote_xpconnect.html', 
'Fun With XBL and XPConnect', '', ' mozilla xbl ', '2003-08-06T04:20:33Z', NULL);
INSERT INTO bookmarks VALUES ('http://www.mozilla.org/projects/xbl/xbl.html', 
'XBL (Extensible Binding Language) 1.0', '', ' mozilla xbl ', '2003-08-06T04:20:18Z', NULL);
INSERT INTO bookmarks VALUES ('http://layeremu.mozdev.org/files/usage.html', 
'mozdev.org - layeremu: files/usage', '', ' mozilla ', '2003-08-06T00:31:43Z', NULL);
INSERT INTO bookmarks VALUES 
('http://www.mozdev.org/source/browse/~checkout~/gnusto/src/gnusto/content/gnusto-lib.js?rev=1.97&content-type=text/plain', 
'http://www.mozdev.org/source/browse/~checkout~/gnusto/src/gnusto/content/gnusto-lib.js?rev=1.97&content-type=text/plain', 
'', ' mozilla ', '2003-08-17T03:31:59Z', NULL);
INSERT INTO bookmarks VALUES ('http://www.mozilla.org/docs/dom/domref/dom_el_ref31.html#1028304', 
'addEventListener', '', ' mozilla javascript ', '2003-10-12T22:17:33Z', NULL);
INSERT INTO bookmarks VALUES ('http://www.mozilla.org/docs/dom/domref/dom_el_ref47.html#1028897', 
'insertBefore', '', ' mozilla javascript ', '2003-10-12T22:17:34Z', NULL);
INSERT INTO bookmarks VALUES ('http://techpubs.sgi.com/library/tpl/cgi-bin/browse.cgi?coll=0650&db=man&pth=/cat1', 
'SGI TPL Browse Man Pages (User Commands (1))', '', ' unix manpages ', '2003-07-14T09:48:03Z', NULL);
INSERT INTO bookmarks VALUES ('http://docsrv.caldera.com/', 
'SCOhelp', '', ' unix manpages ', '2003-07-14T09:48:03Z', NULL);
INSERT INTO bookmarks VALUES ('http://docs.hp.com/hpux/os/man%5Fpages.html', 
'hp-ux reference (manpages)', '', ' unix manpages ', '2003-07-14T09:48:03Z', NULL);
INSERT INTO bookmarks VALUES ('http://sun.doit.wisc.edu/', 
'DoIT/POST SUN Home Page', '', ' unix manpages ', '2003-07-14T09:48:03Z', NULL);
INSERT INTO bookmarks VALUES ('http://publib16.boulder.ibm.com/cgi-bin/ds_form?lang=en_US&viewset=AIX/', 
'AIX Documentation', '', ' unix manpages ', '2003-07-14T09:48:03Z', NULL);
INSERT INTO bookmarks VALUES ('http://www.toccionline.com/creations/ctrla/how.html', 
'CTRL+A Images - Make Your Own', '', ' computing ', '2003-07-19T13:37:06Z', NULL);
INSERT INTO bookmarks VALUES ('http://www.sco.com/developers/gabi/', 
'SCO | Developers | GABI', '', ' computing ', '2003-07-17T00:56:32Z', NULL);
INSERT INTO bookmarks VALUES ('http://openpalm.sourceforge.net/faq.html', 
'FAQ of the OpenPalm Project', '', ' computing ', '2003-07-14T10:19:25Z', NULL);
INSERT INTO bookmarks VALUES ('http://www.improvisation.ws/mb/tpcs1.php', 
'Improv Message Boards - True Porn Clerk Stories', '', ' misc ', '2003-06-10T23:59:11Z', NULL);
INSERT INTO bookmarks VALUES ('http://singsmart.com/freesingingarticles.html', 
'Sing Smart, Not Hard with Vocal Coach Yvonne DeBandi', '', ' misc ', '2003-06-09T23:22:34Z', NULL);

Now that I've made my terrible PHP code public, I guess I'll have to finish off the job. Coming in part 2:

 


[BIO] Jimmy is a single father of one, who enjoys long walks... Oh, right.

Jimmy has been using computers from the tender age of seven, when his father inherited an Amstrad PCW8256. After a few brief flirtations with an Atari ST and numerous versions of DOS and Windows, Jimmy was introduced to Linux in 1998 and hasn't looked back.

In his spare time, Jimmy likes to play guitar and read: not at the same time, but the picks make handy bookmarks.

Copyright © 2005, Jimmy O'Regan. Released under the Open Publication license

Published in Issue 110 of Linux Gazette, January 2005

<-- prev | next -->
Tux