PSA: automatic formatting of holds lists

April 2nd, 2010

Some time back I read an article about how libraries all over the country were individually reinventing the wheel, duplicating effort instead of building on someone else’s. The article mentioned various reasons for this: some librarians didn’t want to feel obligated to maintain code they had put out in public, some were embarrassed about code they considered subpar, some workplaces didn’t allow that sort of sharing, etc.

In Florida, work done by state and county employees, as well as by municipalities, is considered to be in the public domain. So in the spirit of sharing, and in the hope that it will help someone outside our district, here’s some code I wrote in PHP to format holds lists generated by a well-known interface to library catalogs.

The code is far from perfect, but on my laptop (with the laptop itself doing the processing, not sending the list out and waiting for a response) a 100-page report gets accepted, processed, and returned in streamlined form in just a few seconds. So I’m considering it good enough to call “done” and move on.

The script can of course be modified or extended to fit your own needs, according to your own departmental breakdown and/or stack arrangement. The important thing is that it offers a starting point for approaching the problem of overly verbose lists: in my case it turns a 102-page report into a 66-page one, about a third shorter.
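To make the approach concrete, here is a minimal sketch of the first cleanup step on its own: pull the production date out of the repeating page header, then strip the header lines. The sample report text is invented for illustration and the real report layout may differ; the variable name $datePattern is also just for this sketch. The regular expressions are the same ones used in the full script.

```php
<?php
// Invented sample: one page header followed by one entry.
// The real report layout may differ.
$list = "HOLD PICKUP LIST\n"
      . "Produced Fri Apr 2 08:15:00 2010\n"
      . "Library: MAIN\n"
      . "\n"
      . "FIC KIN\n"
      . "King, Stephen\n";

// Capture weekday, month, day and year so they can be reused in the output.
$datePattern = "/Produced ([A-Za-z]{3}) ([A-Za-z]{3}) ([0-9]{1,2}) [0-9]{2}:[0-9]{2}:[0-9]{2} ([0-9]{4})/";
preg_match($datePattern, $list, $matches);
unset($matches[0]);                 // drop the full match, keep the captured groups
$produced = implode(" ", $matches); // "Fri Apr 2 2010" for the sample above

// Strip the header lines that repeat on every page.
$list = preg_replace("/HOLD PICKUP LIST/", "", $list);
$list = preg_replace($datePattern, "", $list);
$list = preg_replace("/Library\:[ ]+[A-Za-z]+(\-[A-Za-z]+)?/", "", $list);
$list = trim($list);

print $produced . "\n";
print $list . "\n";
```

Everything after this step in the full script is the same idea applied to other fields (copy number, type, location, pickup library) before the entries are sorted into groups.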

With those caveats:


<?php
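// read the submitted form fields, defaulting to empty strings
// so a missing field doesn't raise an undefined-index notice
$list = isset($_POST['list']) ? $_POST['list'] : '';
$dept = isset($_POST['dept']) ? $_POST['dept'] : '';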

// removing extra line breaks at beginning and end
$list = trim($list);

// checking to see if there's actually a list
if (empty($list))
	{
	header("Location: workflows.php?status=notext");
	exit; // stop here so nothing else runs or prints after the redirect
	}

else {
	print "<html>\n<head>\n";
	print "<style type=\"text/css\">";
	print "body {\n    font-family:'Arial', 'Helvetica', sans-serif }";
	print "</style>";
	if ( (empty($_POST['dept'])) || (!isset($_POST['dept'])) )
		{print "<title>Holds list: All</title>";}
	elseif ($dept == 'youthserv')
		{print "<title>Holds list: Youth Services</title>";}
	else // Adult Services
		{print "<title>Holds list: Adult Services</title>";}
	print "</head>\n<body>\n";
	}

$jmedia = array();
$easies = array();
$jnonfic = array();
$jfic = array();
$ya = array();

$anonfic = array();
$av = array();
$afic = array();
$avhs = array();

$detritus = array();

// removing extra spaces throughout
$list = preg_replace('/( ){2,}/', ' ', $list);

// remove repeating header
// remove HOLD PICKUP LIST
$orig = "/HOLD PICKUP LIST/";
$list = preg_replace($orig, "", $list);

// remove Production date
$orig = "/Produced ([A-Za-z]{3}) ([A-Za-z]{3}) ([0-9]{1,2}) [0-9]{2}:[0-9]{2}:[0-9]{2} ([0-9]{4})/";
// first try to store this as a variable for later output
preg_match($orig, $list, $matches);
unset ($matches[0]);
$produced = implode(" ", $matches);

// now remove Production date from header
$list = preg_replace($orig, "", $list);

// remove Production library
$orig = "/Library\:[ ]+[A-Za-z]+(\-[A-Za-z]+)?/";
$list = preg_replace($orig, "", $list);

// Converting line breaks
$list = preg_replace("/( )*\n+/", "<br />", $list);

// Removing extra line breaks
$list = preg_replace("/((<br \/>)+\s+(<br \/>)+\s*)+/", "<br /><br />", $list);
$list = preg_replace("/<br \/><br \/><br \/>/", "<br /><br />", $list);

// to get those stray extra line breaks (tried this other ways. no go.)
$list = preg_replace("/<br \/><br \/>[ ]+/", "<br />", $list);

// Removing copy number
$list = preg_replace("/copy\:[0-9]{0,4}/", "", $list);

// Removing type
$list = preg_replace("/type\:[A-Z]{3,}(\-[A-Z]{2,})?/", "", $list);

// Removing location
$list = preg_replace("/location\:[A-Z]{4,}\s{0,}<br \/>\s{0,}/", "", $list);

// Removing pickup library
$list = preg_replace("/Pickup library\:[A-Z]{3,}(\-[A-Z]{2,})?/", "", $list);

// removing line break between callno. and author
$callno = '/(<br \/><br \/>)([A-Z0-9\.\*\,#&-]+)([ ][A-Z0-9\.\*\,#&-]+)([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?([ ][A-Z0-9\.\*\,#&-]+)?\s?<br \/>/';
$replacement = '$1$2$3$4$5$6$7$8 / ';
$list = preg_replace($callno, $replacement, $list);

$listarr = explode("<br /><br />", $list);

// first 'if' here does branch holds
if ( (empty($_POST['dept'])) || (!isset($_POST['dept'])) ) // branch
	{
		foreach ($listarr as $key => $value)
		{
			// group jmedia
			// send off CD JAZZ so all later CD J[A-Z] are in JFic
			if (preg_match('/^CD JAZZ/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/(SOFTWARE|DVD|VIDEO|CD) J[0-9]+/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}
			elseif (preg_match('/^(SOFTWARE|DVD|VIDEO) (E|J)/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}

			// group easies
			elseif (preg_match('/^E /', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			// grabs CDs and Cassettes in area
			// must have grabbed E DVD and VHS first
			elseif (preg_match('/^([A-Z]+[ ])?E [A-Z]+/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			elseif (preg_match('/SNUGGLE/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}

			// group JNonfic
			elseif (preg_match('/^J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			// for the rare Jnonfic cassette.
			// CDs will be assumed in media section.
			// must have grabbed DVD and VHS nonfic first
			elseif (preg_match('/ J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/JBIO/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/^BRAILLE J/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}

			// group Jfic
			elseif (preg_match('/^J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}
			// grab CDs and Cassettes
			// can't use just '/ J[A-Z]+'/ because
			// it grabs from all genres and callnos
			elseif (preg_match('/(CD|CASSETTE|DIGITAL|L\.P\.|TALKBOOK) J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}

			// group YA
			elseif (preg_match('/^YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}
			elseif (preg_match('/ YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}

			// grab adult nonfic and bio
			// assumes no adult genres begin with E, J, OR Y
			elseif (preg_match('/^(\*)?BIO/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}
			elseif (preg_match('/^(\*)?[0-9]/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}

			// group adult CDs, CASSETTES, DVDs
			// grabbed CD JAZZ first
			// E, J, YA grabbed already and exited if/then
			// so it should be no problem
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) [A-Z]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) (\*)?[0-9]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}

			// group adult fiction, genre, L.P., N.R., VF
			// assumes no adult genres begin with E, J, OR Y
			// BUT: E, J, YA grabbed already and exited if/then
			// so it should be no problem
			elseif (preg_match('/^((L\.P\.|N\.R\.|VF) )?[A-Z]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}
			elseif (preg_match('/^(VF) (\*)?[0-9]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}

			// things not grouped elsewhere
			else
			{array_push($detritus, $listarr[$key]);}
		}
	}

// this below to give E/J/YA items only.
elseif ($dept == 'youthserv')
	{
		foreach ($listarr as $key => $value)
		{
			// group E/Jmedia
			// do nothing for CD JAZZ so all later CD J[A-Z] are in JFic
			if (preg_match('/^CD JAZZ/', $listarr[$key]))
			{}
			elseif (preg_match('/(SOFTWARE|DVD|VIDEO|CD) J[0-9]+/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}
			elseif (preg_match('/^(SOFTWARE|DVD|VIDEO) (E|J)/', $listarr[$key]))
			{array_push($jmedia, $listarr[$key]);}

			// group easies
			elseif (preg_match('/^E /', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			// grabs CDs and Cassettes in area
			// must have grabbed E DVD and VHS first
			elseif (preg_match('/^([A-Z]+[ ])?E [A-Z]+/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}
			elseif (preg_match('/SNUGGLE/', $listarr[$key]))
			{array_push($easies, $listarr[$key]);}

			// group JNonfic
			elseif (preg_match('/^J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			// for the rare Jnonfic cassette.
			// CDs will be assumed in media section.
			// must have grabbed DVD and VHS nonfic first
			elseif (preg_match('/ J[0-9]+/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/JBIO/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}
			elseif (preg_match('/^BRAILLE J/', $listarr[$key]))
			{array_push($jnonfic, $listarr[$key]);}

			// group Jfic
			elseif (preg_match('/^J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}
			// grab CDs and Cassettes
			// can't use just '/ J[A-Z]+'/ because
			// it grabs from all genres and callnos
			elseif (preg_match('/(CD|CASSETTE|DIGITAL|L\.P\.|TALKBOOK) J[A-Z]+/', $listarr[$key]))
			{array_push($jfic, $listarr[$key]);}

			// group YA
			elseif (preg_match('/^YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}
			elseif (preg_match('/ YA /', $listarr[$key]))
			{array_push($ya, $listarr[$key]);}

			// officially do nothing for these, so $detritus will work
			elseif (preg_match('/^(\*)?BIO/', $listarr[$key]))
			{}
			elseif (preg_match('/^(\*)?[0-9]/', $listarr[$key]))
			{}

			// officially do nothing for these, so $detritus will work
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) [A-Z]/', $listarr[$key]))
			{}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD|VIDEO|SOFTWARE|TALKBOOK) (\*)?[0-9]/', $listarr[$key]))
			{}

			// officially do nothing for these, so $detritus will work
			elseif (preg_match('/^((L\.P\.|N\.R\.|VF) )?[A-Z]/', $listarr[$key]))
			{}
			elseif (preg_match('/^(VF) (\*)?[0-9]/', $listarr[$key]))
			{}

			// things not grouped elsewhere
			else
			{array_push($detritus, $listarr[$key]);}
		}
	}
else // ($dept == 'adultserv')
	{
		foreach ($listarr as $key => $value)
		{

			// CD left out:
			// must test for CD JAZZ first
			if (preg_match('/^((CASSETTE|DVD|VIDEO|L\.P\.|SOFTWARE|DIGITAL|BRAILLE) )?(E|J|YA)/', $listarr[$key]))
			{}
			elseif (preg_match('/^SNUGGLE/', $listarr[$key]))
			{}

			// grab adult nonfic and bio
			elseif (preg_match('/^(\*)?BIO/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}
			elseif (preg_match('/^(\*)?[0-9]/', $listarr[$key]))
			{array_push($anonfic, $listarr[$key]);}

			// group adult CDs, CASSETTES, DVDs
			// grab CD JAZZ first b/c later range skips it
			// must have seen E, J, and Y first and exited if/then
			elseif (preg_match('/^CD JAZZ/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/CD (J|YA)/', $listarr[$key]))
			{}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD) [A-Z]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}
			elseif (preg_match('/^(CD|CASSETTE|DIGITAL|DVD) (\*)?[0-9]/', $listarr[$key]))
			{array_push($av, $listarr[$key]);}

			// group adult VHS, SOFTWARE, TALKBOOK, V.F.
			// must have seen E, J, and Y first and exited if/then
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) [A-D]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) [F-I]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) [K-X]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}
			elseif (preg_match('/^(VIDEO|SOFTWARE|TALKBOOK|VF) (\*)?[0-9]/', $listarr[$key]))
			{array_push($avhs, $listarr[$key]);}

			// group adult fiction, genre, L.P., N.R.
			// must have seen E, J, and Y first and exited if/then
			elseif (preg_match('/^((L\.P\.|N\.R\.) )?[A-D]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}
			elseif (preg_match('/^((L\.P\.|N\.R\.) )?[F-I]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}
			elseif (preg_match('/^((L\.P\.|N\.R\.) )?[K-X]/', $listarr[$key]))
			{array_push($afic, $listarr[$key]);}	

			// things not grouped elsewhere
			else
			{array_push($detritus, $listarr[$key]);}
		}
	}
// end grouping

// clean up detritus so later if/then will work
$detritus = implode("<br /><br />", $detritus);
$detritus = trim($detritus);

	if ($dept == 'youthserv')
	{
		print "<p>HOLDS " . $produced . " // <strong>Media</strong><br />\n";
		print implode("<br /><br />", $jmedia);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Easies</strong>";
		print "<br />" . implode("<br /><br />", $easies);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JFiction</strong>";
		print "<br />" . implode("<br /><br />", $jfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JNonfiction</strong>";
		print "<br />" . implode("<br /><br />", $jnonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>YA</strong>";
		print "<br />" . implode("<br /><br />", $ya);
		print "</p>";
	}
	elseif ($dept == 'adultserv')
	{
		print "<p>HOLDS " . $produced . " // <strong>Nonfiction</strong><br />\n";
		print implode("<br /><br />", $anonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>A/V</strong>";
		print "<br />" . implode("<br /><br />", $av);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Fiction</strong>";
		print "<br />" . implode("<br /><br />", $afic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>VHS</strong>";
		print "<br />" . implode("<br /><br />", $avhs);
		print "</p>";
	}
	else // branch
	{
		print "<p>HOLDS " . $produced . " // <strong>E/J Media</strong><br />\n";
		print implode("<br /><br />", $jmedia);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Easies</strong>";
		print "<br />" . implode("<br /><br />", $easies);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JFiction</strong>";
		print "<br />" . implode("<br /><br />", $jfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>JNonfiction</strong>";
		print "<br />" . implode("<br /><br />", $jnonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>YA</strong>";
		print "<br />" . implode("<br /><br />", $ya);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Nonfiction</strong>";
		print "<br />" . implode("<br /><br />", $anonfic);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>A/V</strong>";
		print "<br />" . implode("<br /><br />", $av);
		print "\n<br /><br />HOLDS " . $produced . " // <strong>Fiction</strong>";
		print "<br />" . implode("<br /><br />", $afic);
		print "</p>";
	}

if ((!empty($detritus)) && ($detritus != "<br /><br />"))
	{
		print "\n<p><strong>Entries that didn't get grouped:</strong>";
		print "\n<br />The following items were on the holds list but couldn't be placed.";
		print "\n<br />If you're seeing this message it's because there is a bug in the program.";
		print "\n<br /><br />Please copy everything below this line and ";
		print "<a href='mailto:";
		print "/* YOUR EMAIL ADDRESS GOES HERE */";
		print "?subject=HOLDS " . $produced . " for ";
		if (($dept == 'youthserv') || ($dept == 'adultserv'))
			{print $dept;}
		else {print "branch";}
		// the next two statements were missing their semicolons
		print "'>email it to ";
		print "/* YOUR NAME GOES HERE */";
		print "</a>.  Thank you!";
		if (($dept == 'youthserv') || ($dept == 'adultserv'))
			{
			print "<br /><br />\n<em>These items may or may not be part of your list:</em>";
			}
		print "<br />" . $detritus . "</p>";
	}
print "</body>\n</html>";
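
The three foreach loops above repeat mostly the same preg_match chain. One way to cut that duplication, offered as a sketch and not a drop-in replacement, is an ordered table of pattern and group pairs checked by a single helper. The function name classify_entry and the $rules array are hypothetical, the sample entries are made up, and only a handful of the branch patterns are included.

```php
<?php
// Hypothetical helper: returns the name of the first group whose pattern
// matches the entry, or 'detritus' if none match. Order matters, exactly
// as it does in the elseif chains of the full script.
function classify_entry($entry, $rules)
{
    foreach ($rules as $rule) {
        list($pattern, $group) = $rule;
        if (preg_match($pattern, $entry)) {
            return $group;
        }
    }
    return 'detritus';
}

// A small subset of the branch rules, in the same order as the script.
$rules = array(
    array('/^CD JAZZ/', 'av'),
    array('/(SOFTWARE|DVD|VIDEO|CD) J[0-9]+/', 'jmedia'),
    array('/^(SOFTWARE|DVD|VIDEO) (E|J)/', 'jmedia'),
    array('/^E /', 'easies'),
    array('/^J[0-9]+/', 'jnonfic'),
    array('/^J[A-Z]+/', 'jfic'),
    array('/^YA /', 'ya'),
    array('/^(\*)?[0-9]/', 'anonfic'),
    array('/^((L\.P\.|N\.R\.|VF) )?[A-Z]/', 'afic'),
);

// Made-up sample entries, only for illustration.
$samples = array(
    'CD JAZZ DAV / Davis, Miles',
    'DVD J567.9 DIN / Dinosaurs',
    'E SEU / Seuss, Dr.',
    'JFIC ROW / Rowling, J. K.',
    '641.5 CHI / Child, Julia',
    'FIC KIN / King, Stephen',
    '?? unreadable line',
);

// Sort each sample into its group, then report how many landed in each.
$groups = array();
foreach ($samples as $entry) {
    $groups[classify_entry($entry, $rules)][] = $entry;
}

foreach ($groups as $name => $entries) {
    print $name . ': ' . count($entries) . "\n";
}
```

With a table like this, a youth-services or adult-services view could reuse the same rules and simply skip the groups it doesn't print, instead of keeping three copies of the chain in sync.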

Big thanks to AskMe for their help in figuring out how to approach this project.

reading: 2009

January 27th, 2010

Way back in my undergrad days, when I was incredibly depressed and rudderless and desperate for some sense of meaning or achievement, I started keeping track of what books I read and what movies I watched.

It’s something I’ve stuck with because the items on the list turn out to be convenient fenceposts along my memory: I watched this with her; I saw this one right after I moved; I watched this one abroad with family, etc.

At any rate, as a librarian I think the books I’ve read are worth some reflection.

Death Note vol. 5 – 12
Surely You’re Joking, Mr. Feynman
The Adventures of Barry Ween, Boy Genius: 2.0
Skim
The King of Mulberry Street
Deogratias
I Know Why the Caged Bird Sings
Good As Lily
Bound by Law?
The Complete Concrete
The Book of Lists: Horror
Squirrel Mother
The Bloody Streets of Paris (graphic novel, not prose)
The Pride of Baghdad
Stray Bullets v. 1
Dignifying Science
Queen Bee
Fax from Sarajevo
Aya
Nation
Notes for a War Story
Mom’s Cancer
Sentences
Into the Volcano
Astro City: Life in the Big City
Ghost World
Artemis Fowl (graphic novel)
Rapunzel’s Revenge
Yossel. April 1943
Y: the Last Man, v. 1 -10
Dogs & Water
The Eternals
Blue Pills
Same Difference & Other Stories
Artemis Fowl (prose)
Artemis Fowl: The Arctic Incident
Artemis Fowl: The Eternity Code
Artemis Fowl: The Opal Deception
Superman for All Seasons
Tales from the Brothers Grimm (graphic novel adaptation)
Artemis Fowl: The Lost Colony
Why I Killed Peter
Artemis Fowl: The Time Paradox
Invincible trade paperback: v.1 – 11 (and, later, Ultimate Invincibles 1-4)
Jellaby: Monster in the City
Fade
Queen & Country: Operation Broken Ground
Swallow Me Whole
Chicken with Plums
Don’t Look Behind You
The Graveyard Book
The Artemis Fowl Files
Street Angel
Blueberry Girl
Crazy Hair
House
Debbie Harry Sings in French
Grammar of the Shot
Hatter M: v. 1 in the Looking Glass Wars (graphic novel)
Creatures of the Night
The Facts in the Case of the Disappearance of Miss Finch
A Wrinkle in Time
A Wind in the Door
The Book of Genesis Illustrated by R. Crumb
The Strain
The Beast of Chicago: the Murderous Career of H.H. Holmes
The Secret Science Alliance and the Copy Cat Crook
Johnny Hiro
G-Man v. 1: Learning to Fly
Diary of a Wimpy Kid: Greg Heffley’s Journal
Diary of a Wimpy Kid: Rodrick Rules
Daredevil: Echo / Vision Quest
Asterios Polyp
The Boy Who Harnessed the Wind
Monster
Little Brother
Smax
Animal Man
The Professor’s Daughter
The Goon: Chinatown
Channel Zero
Desperadoes: a Moment’s Silence

When I read Nation in May I thought it would be the best book I read all year–and it was, until December, when I read The Boy Who Harnessed the Wind, which turned out to be one of the best books I’ve ever read.

On this list, it’s worth noting the works that hinge on older technology or culture in a way that hurts the story: The Complete Concrete suffers from its cultural references, many of them to people younger readers wouldn’t know; Johnny Hiro is moving in that direction; the plot of Channel Zero relies on obsolete technology; and while I enjoyed Little Brother, I expect that in twenty years it will look quaint.

Smax and Animal Man were both terrible books by good comics authors. As for the Neil Gaiman books I read this year–all of which struck me as slight, including The Graveyard Book, which won the Newbery–I’d like to say that I’m done with his work. But I’ll probably keep reading his books as they come out, in the vain hope that they’ll be worth it.

The Strain was a terrible book co-written by a decent director (Guillermo del Toro).

Diary of a Wimpy Kid didn’t amuse me at all. To me, it just seemed like the same joke over and over (look: Greg Heffley is self-centered and oblivious!), and I chose not to read the next two books although I’d already bought them. The library was happy to have them.

The Artemis Fowl books are good individually, with probably The Opal Deception the strongest among them, but reading them all within a short span makes their weaknesses show: each one depends entirely too much on that same tired ace up Colfer’s sleeve.

Y: The Last Man was very good, in spite of some spots that strained suspension of disbelief, but the ending was crap. It, like Asterios Polyp, works much better if you ignore something big that happens at the end (except with Y there are two of those somethings, one pointlessly cruel and the other frankly impossible).

Invincible was another good find. It’s a rare treat to find a fresh voice in superhero comics, one that isn’t in the vein of Alan Moore or Frank Miller, a vein too macho to show warmth and too cynical to show hope.

Of all these, it looks like fewer than 20% were written or co-written by women (and since most of the books here are comics, that just reinforces the notion, correct in this case, that comics are a boyzone). Interestingly, I liked three-quarters of the books by women, which has the men beat by far.

The memoirs ranged from good to very good. The non-memoir non-fiction books were more uneven. And I still don’t know what to make of Crumb’s Genesis, except that I wanted to like it more than I did.