PSA: automatic formatting of holds lists

April 2nd, 2010

Some time back I read an article about how libraries all over the country were individually reinventing the wheel, duplicating effort instead of building on someone else’s work. The article mentioned various reasons for this: some librarians didn’t want to feel obligated to maintain code they had put out in public, some were embarrassed about code they considered subpar, some workplaces didn’t allow that sort of sharing, etc.

In Florida, work done by state, county, and municipal employees is all considered to be in the public domain. So in a spirit of sharing, and in the hope that it will help someone outside our district, here’s some code I wrote in PHP to format holds lists generated by Sirsi-Dynix (a good system not to have, IMO).

The code is far from perfect, but on my laptop (with the laptop itself doing the processing, not sending the list out and waiting for a response) a 100-page report gets accepted, processed, and returned in streamlined form in just a few seconds. So I’m considering it good enough to call “done” and move on.

The script can of course be modified or extended to fit your own needs, according to your own departmental breakdown and/or stack arrangement. The important thing is that I think it can give a starting point on how to approach the problem of overly verbose lists (it turns a 102-page report into a 66-page one).
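
To make the approach concrete, here is a minimal sketch of the stripping step on its own. It is a simplified illustration, not the script itself: the `trimHoldsText` function name and the sample entry are made up, and the sample's layout is only inferred from the fields the script removes, so a real report may look different.

```php
<?php
// Hypothetical entry: the layout is inferred from the patterns the script
// removes, not copied from a real report.
$sample = "HOLD PICKUP LIST\n"
        . "Produced Fri Apr 2 08:15:00 2010\n"
        . "Library: MAIN\n"
        . "\n"
        . "J FIC DOE\n"
        . "Doe, Jane   An Example Title   copy:1   type:BOOK\n"
        . "location:JFICTION\n"
        . "Pickup library:MAIN\n";

// Hypothetical helper: strips the fields a shelver does not need.
function trimHoldsText($text)
{
    // Header lines and per-item fields to delete, as regular expressions.
    $noise = array(
        '/HOLD PICKUP LIST/',
        '/Produced [A-Za-z]{3} [A-Za-z]{3} [0-9]{1,2} [0-9:]{8} [0-9]{4}/',
        '/Library:[ ]+[A-Za-z-]+/',
        '/copy:[0-9]{0,4}/',
        '/type:[A-Z-]+/',
        '/location:[A-Z]+/',
        '/Pickup library:[A-Z-]+/',
    );
    $text = preg_replace($noise, '', $text);

    // Squeeze runs of spaces and drop any line that is now empty.
    $lines = array();
    foreach (explode("\n", $text) as $line) {
        $line = trim(preg_replace('/ {2,}/', ' ', $line));
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return implode("\n", $lines);
}

echo trimHoldsText($sample), "\n";
```

Run from the command line, this leaves just the call number line and the author/title line. The full script works on the same principle and then goes further by sorting the entries into shelving areas.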

With those caveats:


<?php
// read the form fields (empty strings if the form wasn't submitted)
$list = isset($_POST['list']) ? $_POST['list'] : '';
$dept = isset($_POST['dept']) ? $_POST['dept'] : '';

// removing extra line breaks at beginning and end
$list = trim($list);

// checking to see if there's actually a list
if (empty($list))
	{
	header("Location: workflows.php?status=notext");
	exit; // stop here so nothing else runs after the redirect
	}

else {
	print "<html>\n<head>\n";
	print "<style type=\"text/css\">";
	print "body {\n    font-family:'Arial', 'Helvetica', sans-serif }";
	print "</style>";
	if (empty($_POST['dept'])) // empty() is also true when the field is not set
		{print "<title>Holds list: All</title>";}
	elseif ($dept == 'youthserv')
		{print "<title>Holds list: Youth Services</title>";}
	else // Adult Services
		{print "<title>Holds list: Adult Services</title>";}
	print "</head>\n<body>\n";
	}

$jmedia = array();
$easies = array();
$jnonfic = array();
$jfic = array();
$ya = array();

$anonfic = array();
$av = array();
$afic = array();
$avhs = array();

$detritus = array();

// removing extra spaces throughout
$list = preg_replace('/( ){2,}/', ' ', $list);

// remove repeating header
// remove HOLD PICKUP LIST
$orig = "/HOLD PICKUP LIST/";
$list = preg_replace($orig, "", $list);

// remove Production date
$orig = "/Produced ([A-Za-z]{3}) ([A-Za-z]{3}) ([0-9]{1,2}) [0-9]{2}:[0-9]{2}:[0-9]{2} ([0-9]{4})/";
// first try to store this as a variable for later output
preg_match($orig, $list, $matches);
unset ($matches[0]);
$produced = implode(" ", $matches);

// now remove Production date from header
$list = preg_replace($orig, "", $list);

// remove Production library
$orig = "/Library\:[ ]+[A-Za-z]+(\-[A-Za-z]+)?/";
$list = preg_replace($orig, "", $list);

// Converting line breaks
$list = preg_replace("/( )*\n+/", "<br />", $list);

// Removing extra line breaks
$list = preg_replace("/((<br \/>)+\s+(<br \/>)+\s*)+/", "<br /><br />", $list);
$list = preg_replace("/<br \/><br \/><br \/>/", "<br /><br />", $list);

// to get those stray extra line breaks (tried this other ways. no go.)
$list = preg_replace("/<br \/><br \/>[ ]+/", "<br />", $list);

// Removing copy number
$list = preg_replace("/copy\:[0-9]{0,4}/", "", $list);

// Removing type
$list = preg_replace("/type\:[A-Z]{3,}(\-[A-Z]{2,})?/", "", $list);

// Removing location
$list = preg_replace("/location\:[A-Z]{4,}\s{0,}<br \/>\s{0,}/", "", $list);

// Removing pickup library
$list = preg_replace("/Pickup library\:[A-Z]{3,}(\-[A-Z]{2,})?/", "", $list);

// removing line break between callno. and author
$callno = '/(<br \/><br \/>)([A-Z0-9\.\*\,#&-]+)([ ][A-Z0-9\.\*\,#&-]+)([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?\s?<br \/>/';
$replacement = '$1$2$3$4$5$6$7$8 / ';
$list = preg_replace($callno, $replacement, $list);

$listarr = explode("<br /><br />", $list);

// first 'if' here does branch holds
if (empty($_POST['dept'])) // branch; empty() is also true when the field is not set
	{
		foreach ($listarr as $key => $value)
		{
			// group jmedia
			// send off CD JAZZ so all later CD J[A-Z] are in JFic
			if (preg_match('/^CD JAZZ/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/(SOFTWARE|DVD|VIDEO|CD) J[0-9]+/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}
			elseif (preg_match('/^(SOFTWARE|DVD|VIDEO) (E|J)/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}

			// group easies
			elseif (preg_match('/^E /', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			// grabs CDs and Cassettes in area
			// must have grabbed E DVD and VHS first
			elseif (preg_match('/^([A-Z]+[ ])?E [A-Z]+/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			elseif (preg_match('/SNUGGLE/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}

			// group JNonfic
			elseif (preg_match('/^J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			// for the rare Jnonfic cassette.
			// CDs will be assumed in media section.
			// must have grabbed DVD and VHS nonfic first
			elseif (preg_match('/ J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/JBIO/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/^BRAILLE J/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}

			// group Jfic
			elseif (preg_match('/^J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}
			// grab CDs and Cassettes
			// can't use just '/ J[A-Z]+'/ because
			// it grabs from all genres and callnos
			elseif (preg_match('/(CD|CASSETTE|DIGITAL|L\.P\.|TALKBOOK) J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}

			// group YA
			elseif (preg_match('/^YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}
			elseif (preg_match('/ YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}

			// grab adult nonfic and bio
			// assumes no adult genres begin with E, J, OR Y
			elseif (preg_match('/^(\*)?BIO/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}
			elseif (preg_match('/^(\*)?[0-9]/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}

			// group adult CDs, CASSETTES, DVDs
			// grabbed CD JAZZ first
			// E, J, YA grabbed already and exited if/then
			// so it should be no problem
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) [A-Z]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) (\*)?[0-9]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}

			// group adult fiction, genre, L.P., N.R., VF
			// assumes no adult genres begin with E, J, OR Y
			// BUT: E, J, YA grabbed already and exited if/then
			// so it should be no problem
			elseif (preg_match('/^((L\.P\.|N\.R\.|VF) )?[A-Z]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}
			elseif (preg_match('/^(VF) (\*)?[0-9]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}

			// things not grouped elsewhere
			else
			{array_push($detritus, $listarr[$key]);}
		}
	}

// this below to give E/J/YA items only.
elseif ($dept == 'youthserv')
	{
		foreach ($listarr as $key => $value)
		{
			// group E/Jmedia
			// do nothing for CD JAZZ so all later CD J[A-Z] are in JFic
			if (preg_match('/^CD JAZZ/', $listarr[$key]))
			{}
			elseif (preg_match('/(SOFTWARE|DVD|VIDEO|CD) J[0-9]+/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}
			elseif (preg_match('/^(SOFTWARE|DVD|VIDEO) (E|J)/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}

			// group easies
			elseif (preg_match('/^E /', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			// grabs CDs and Cassettes in area
			// must have grabbed E DVD and VHS first
			elseif (preg_match('/^([A-Z]+[ ])?E [A-Z]+/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			elseif (preg_match('/SNUGGLE/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}

			// group JNonfic
			elseif (preg_match('/^J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			// for the rare Jnonfic cassette.
			// CDs will be assumed in media section.
			// must have grabbed DVD and VHS nonfic first
			elseif (preg_match('/ J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/JBIO/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/^BRAILLE J/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}

			// group Jfic
			elseif (preg_match('/^J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}
			// grab CDs and Cassettes
			// can't use just '/ J[A-Z]+'/ because
			// it grabs from all genres and callnos
			elseif (preg_match('/(CD|CASSETTE|DIGITAL|L\.P\.|TALKBOOK) J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}

			// group YA
			elseif (preg_match('/^YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}
			elseif (preg_match('/ YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}

			// officially do nothing for these, so $detritus will work
			elseif (preg_match('/^(\*)?BIO/', $listarr[$key]))
			{}
			elseif (preg_match('/^(\*)?[0-9]/', $listarr[$key]))
			{}

			// officially do nothing for these, so $detritus will work
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) [A-Z]/', $listarr[$key]))
			{}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) (\*)?[0-9]/', $listarr[$key]))
			{}

			// officially do nothing for these, so $detritus will work
			elseif (preg_match('/^((L\.P\.|N\.R\.|VF) )?[A-Z]/', $listarr[$key]))
			{}
			elseif (preg_match('/^(VF) (\*)?[0-9]/', $listarr[$key]))
			{}

			// things not grouped elsewhere
			else
			{array_push($detritus, $listarr[$key]);}
		}
	}
else // ($dept == 'adultserv')
	{
		foreach ($listarr as $key => $value)
		{

			// CD left out:
			// must test for CD JAZZ first
			if (preg_match('/^((CASSETTE|DVD|VIDEO|L\.P\.|SOFTWARE|DIGITAL|BRAILLE) )?(E|J|YA)/', $listarr[$key]))
			{}
			elseif (preg_match('/^SNUGGLE/', $listarr[$key]))
			{}

			// grab adult nonfic and bio
			elseif (preg_match('/^(\*)?BIO/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}
			elseif (preg_match('/^(\*)?[0-9]/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}

			// group adult CDs, CASSETTES, DVDs
			// grab CD JAZZ first b/c later range skips it
			// must have seen E, J, and Y first and exited if/then
			elseif (preg_match('/^CD JAZZ/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/CD (J|YA)/', $listarr[$key]))
			{}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD) [A-Z]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD) (\*)?[0-9]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}

			// group adult VHS, SOFTWARE, TALKBOOK, V.F.
			// must have seen E, J, and Y first and exited if/then
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) [A-D]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) [F-I]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) [K-X]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) (\*)?[0-9]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}

			// group adult fiction, genre, L.P., N.R.
			// must have seen E, J, and Y first and exited if/then
			elseif (preg_match('/^((L\.P\.|N\.R\.) )?[A-D]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}
			elseif (preg_match('/^((L\.P\.|N\.R\.) )?[F-I]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}
			elseif (preg_match('/^((L\.P\.|N\.R\.) )?[K-X]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}	

			// things not grouped elsewhere
			else
			{array_push($detritus, $listarr[$key]);}
		}
	}
// end removals

// clean up detritus so later if/then will work
$detritus = implode("<br /><br />", $detritus);
$detritus = trim($detritus);

	if ($dept == 'youthserv')
	{
		print "<p>HOLDS " . $produced . " // <strong>Media</strong><br />\n";
		print implode("<br /><br />", $jmedia);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Easies</strong>";
		print "<br />" . implode("<br /><br />", $easies);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JFiction</strong>";
		print "<br />" . implode("<br /><br />", $jfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JNonfiction</strong>";
		print "<br />" . implode("<br /><br />", $jnonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>YA</strong>";
		print "<br />" . implode("<br /><br />", $ya);
		print "</p>";
	}
	elseif ($dept == 'adultserv')
	{
		print "<p>HOLDS " . $produced . " // <strong>Nonfiction</strong><br />\n";
		print implode("<br /><br />", $anonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>A/V</strong>";
		print "<br />" . implode("<br /><br />", $av);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Fiction</strong>";
		print "<br />" . implode("<br /><br />", $afic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>VHS</strong>";
		print "<br />" . implode("<br /><br />", $avhs);
		print "</p>";
	}
	else // branch
	{
		print "<p>HOLDS " . $produced . " // <strong>E/J Media</strong><br />\n";
		print implode("<br /><br />", $jmedia);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Easies</strong>";
		print "<br />" . implode("<br /><br />", $easies);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JFiction</strong>";
		print "<br />" . implode("<br /><br />", $jfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JNonfiction</strong>";
		print "<br />" . implode("<br /><br />", $jnonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>YA</strong>";
		print "<br />" . implode("<br /><br />", $ya);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Nonfiction</strong>";
		print "<br />" . implode("<br /><br />", $anonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>A/V</strong>";
		print "<br />" . implode("<br /><br />", $av);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Fiction</strong>";
		print "<br />" . implode("<br /><br />", $afic);
		print "</p>";
	}

if ((!empty($detritus)) && ($detritus != "<br /><br />"))
	{
		print "\n<p><strong>Entries that didn't get grouped:</strong>";
		print "\n<br />The following items were on the holds list but couldn't be placed.";
		print "\n<br />If you're seeing this message it's because there is a bug in the program.";
		print "\n<br /><br />Please copy everything below this line and ";
		print "<a href='mailto:";
		print "/* YOUR EMAIL ADDRESS GOES HERE */";
		print "?subject=HOLDS " . $produced . " for ";
		if (($dept == 'youthserv') || ($dept == 'adultserv'))
			{print $dept;}
		else {print "branch";}
		print "'>email it to ";
		print "/* YOUR NAME GOES HERE */";
		print "</a>.  Thank you!";
		if (($dept == 'youthserv') || ($dept == 'adultserv'))
			{
			print "<br /><br />\n<em>These items may or may not be part of your list:</em>";
			}
		print "<br />" . $detritus . "</p>";
	}
print "</body>\n</html>";
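
The three loops repeat largely the same patterns, so one possible tidy-up is to keep the patterns in an ordered table and walk it once per entry. The following is a minimal sketch of that idea, not a drop-in replacement: `classifyEntry` and `groupEntries` are hypothetical helpers, only a few of the rules are carried over, the sample entries are invented, and the separate VHS handling in the Adult Services branch is not reproduced.

```php
<?php
// Ordered rules: the first pattern that matches wins, which mirrors an
// if/elseif chain. Only a handful of the script's rules are shown.
$rules = array(
    array('/^CD JAZZ/',                        'av'),
    array('/(SOFTWARE|DVD|VIDEO|CD) J[0-9]+/', 'jmedia'),
    array('/^(SOFTWARE|DVD|VIDEO) (E|J)/',     'jmedia'),
    array('/^E /',                             'easies'),
    array('/^J[0-9]+/',                        'jnonfic'),
    array('/^J[A-Z]+/',                        'jfic'),
    array('/^YA /',                            'ya'),
    array('/^(\*)?[0-9]/',                     'anonfic'),
);

// Hypothetical helper: returns the group name for one entry,
// or 'detritus' when no rule matches.
function classifyEntry($entry, array $rules)
{
    foreach ($rules as $rule) {
        list($pattern, $group) = $rule;
        if (preg_match($pattern, $entry)) {
            return $group;
        }
    }
    return 'detritus';
}

// Hypothetical helper: sorts entries into the groups a department wants.
// Entries that belong to another department are dropped silently;
// entries that match no rule at all are kept under 'detritus'.
function groupEntries(array $entries, array $rules, array $wanted)
{
    $groups = array_fill_keys($wanted, array());
    $groups['detritus'] = array();
    foreach ($entries as $entry) {
        $group = classifyEntry($entry, $rules);
        if ($group === 'detritus' || in_array($group, $wanted, true)) {
            $groups[$group][] = $entry;
        }
    }
    return $groups;
}

// Invented sample entries, already in "callno / author" form.
$entries = array(
    'E DOE / Doe, Jane',
    'J599 ROE / Roe, Sam',
    '641.5 POE / Poe, Ann',
    '?? unreadable',
);
$youth = groupEntries($entries, $rules, array('jmedia', 'easies', 'jfic', 'jnonfic', 'ya'));
print_r($youth);
```

With the Youth Services groups requested, the easy reader and the juvenile nonfiction title land in their groups, the adult nonfiction title is skipped, and the unreadable entry ends up in `detritus`. Keeping the rules in one table means a new call number prefix only has to be added in one place.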

Big thanks to AskMe for their help in figuring out how to approach this project.

2 Responses to “PSA: automatic formatting of holds lists”

  1. John on 02 Apr 2010 at 2:48 pm

    I know the regexes could be much more concise and elegant. It works, it doesn’t put a big load on the server, it’s fast when the network responds quickly, I’m no longer worried about it. ^__^

  2. John on 07 Apr 2010 at 8:47 am

    I really hope someone somewhere finds this useful.
