January 17, 2019

Fascinating ISBNs

After three years of successfully overlaying MARC records in Koha by matching on field 020 (ISBN), I found a file that persistently refused to overlay, even though nothing was apparently wrong with the raw data.

Having aggressive ISBN and ISSN matching enabled allows these identifiers to be matched regardless of any hyphens or other typographical symbols in them, but it also means that invalid ISBNs and ISSNs are disregarded in the process of matching. So how do you determine whether an ISBN is valid?
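To make the idea of hyphen-insensitive matching concrete, here is a minimal Python sketch. It is not Koha's actual code: the function names `normalize_identifier` and `same_identifier` are made up for illustration, and it assumes the input is a bare identifier with no qualifier such as "(pbk)" attached.

```python
import re


def normalize_identifier(raw: str) -> str:
    """Reduce an ISBN or ISSN to its significant characters.

    Hypothetical helper, not part of Koha: keeps ASCII digits and the
    letter X (the check character of ISBN-10 and ISSN) and drops
    hyphens, spaces and any other typographical symbols.
    """
    return re.sub(r"[^0-9Xx]", "", raw).upper()


def same_identifier(a: str, b: str) -> bool:
    """Compare two identifiers while ignoring their punctuation."""
    return normalize_identifier(a) == normalize_identifier(b)
```

With this, `same_identifier("978-981-13-1267-0", "9789811312670")` is true, since both sides reduce to the same thirteen digits. Note that the sketch only compares strings; it says nothing about whether the identifier is valid.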

Well, as per the latest ISO 2108:2017, the 13-digit ISBN contains five groups of numbers: GS1 prefix, Registration group element, Registrant element, Publication element, Check digit. These are explained well in detail by Bill Pearce. The crux seems to be the last element, the check digit, which is always one character, calculated from the other twelve digits of the ISBN and used to check the validity of the ISBN. The calculation takes those twelve digits from left to right and multiplies them alternately by 1 and 3. The products are added together and the sum is divided by 10. If the remainder is 0, the check digit is 0; otherwise the remainder is subtracted from 10, and the difference is the check digit. To validate an ISBN, apply the same weights to all thirteen digits, check digit included: for a valid ISBN the weighted sum is divisible by 10, leaving a remainder of zero.
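The check-digit arithmetic can be sketched in a few lines of Python. This is only an illustration of the ISBN-13 calculation, not Koha's implementation, and the function names `isbn13_check_digit` and `is_valid_isbn13` are my own.

```python
def _weighted_sum(digits: str) -> int:
    """Sum the digits with weights 1, 3, 1, 3, ... from left to right."""
    return sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))


def isbn13_check_digit(first12: str) -> int:
    """Compute the check digit from the first twelve digits of an ISBN-13."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # A remainder of 0 gives check digit 0; otherwise it is 10 minus the remainder.
    return (10 - _weighted_sum(first12) % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Validate an ISBN-13, ignoring hyphens and spaces."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # With the check digit included, the weighted sum must be a multiple of 10.
    return _weighted_sum(digits) % 10 == 0
```

For example, the twelve digits 978981131267 give a weighted sum of 120, the remainder after dividing by 10 is 0, and so the check digit is 0: `isbn13_check_digit("978981131267")` returns 0 and `is_valid_isbn13("9789811312670")` returns `True`.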

Fun fact: The 3-digit prefix at the beginning of the ISBN is adopted from the EAN system, where it denotes country of origin. For example, 978 refers to the fictitious country of Bookland, and with the expansion of that sequence, 979, formerly Musicland, is also used for books.

Validity, however, was not an issue for the ISBNs I was working with (all Springer titles: 9789811312670, 9789811312618, 9789811312649)… yet matching on the 020 field was still unsuccessful. This meant that some Koha normalization rule that kicks in during the checks on import was preventing the match. In fact, as our support company later discovered, Koha was using an outdated reference file which identified the publisher as invalid. Who knew (especially as this doesn’t seem to be documented at all by the Koha community) that there’s such a reference file with registration groups and registrant elements?

© 2019 Miglena Minkova