BIBFRAME

Bibliographic Framework Initiative (Library of Congress)

Library of Congress BIBFRAME 2.0 Components Available

The Library of Congress announces the availability of new BIBFRAME 2.0 components for converting BIBFRAME data to MARC.  This is the result of work conducted as part of the Library’s BIBFRAME 2.0 cataloging pilot.

Catalogers in the pilot currently input bibliographic metadata twice – once in BIBFRAME and once in MARC. To reduce this duplicate data entry, the Library has been developing a converter that turns BIBFRAME descriptions into MARC records suitable for loading into the Library’s Integrated Library System (ILS). This converter is now ready to share with the community to help others carry out development and investigations of the linked data environment using the BIBFRAME 2.0 vocabulary.

BIBFRAME 2.0 to MARC Specifications 

The conversion specifications, available at www.loc.gov/bibframe/bftm, were written by Index Data, under contract to and at the direction of the Library’s Network Development and MARC Standards Office. The specifications are written in a domain-specific language in XML for specifying rules to generate MARCXML from RDF/XML, which we have labeled the “RDF2MARC” conversion language. The specifications are presented as MS Excel files that are organized by MARC tag ranges. They are based on the MARC to BIBFRAME conversion specifications at www.loc.gov/bibframe/mtbf maintained by the Library of Congress, and track that conversion very closely.

Some MARC elements are rarely used in Library of Congress records, or cannot be generated reliably from BIBFRAME. In such cases, the Specifications usually say "nac" (no attempt to code) in the conversion column.

The Specifications use the vocabularies and authorities from the Linked Data Service at id.loc.gov, as they appear in the Library’s BIBFRAME descriptions.

The Specifications will be changed as needed as we develop the system for the BIBFRAME 2.0 Pilot. Each revision will be indicated in the file name and in the URI on the document (e.g., v1.0, v1.1, v1.2, …). Changes in a new version will be color coded.

BIBFRAME to MARC Conversion Programs

The Programs for converting the BIBFRAME data to MARC were also written for the Library of Congress by Index Data. They are written in XSLT and are available for download on the Library of Congress GitHub site at github.com/lcnetdev/bibframe2marc. The Library is currently working with these conversions in its Pilot project, and we expect that they will be adjusted as work progresses. Adjustments by the Library of Congress will always be kept in step with adjustments to the Specifications described above.

Index Data has also written a Perl library as a wrapper for the bibframe2marc XSLT application, with a command-line tool for batch processing. Source code and documentation for the Perl library are at github.com/lcnetdev/biblio-bf2marc, and the library will also be published on CPAN as Biblio::BF2MARC.
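
As a rough illustration of how an XSLT-based conversion of this kind might be run, the sketch below builds a command line for xsltproc, a widely available command-line XSLT processor, from Python. It assumes the stylesheet can be run with xsltproc; the stylesheet and file names are hypothetical and are not taken from this announcement, so the repository documentation should be checked for the actual build output and invocation.

```python
import subprocess
from typing import List


def build_xslt_command(stylesheet: str, source: str, output: str) -> List[str]:
    """Return an argument list of the form: xsltproc -o OUTPUT STYLESHEET SOURCE."""
    return ["xsltproc", "-o", output, stylesheet, source]


def convert(stylesheet: str, source: str, output: str) -> None:
    """Run the transformation; raises CalledProcessError if xsltproc fails."""
    subprocess.run(build_xslt_command(stylesheet, source, output), check=True)


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    command = build_xslt_command("bibframe2marc.xsl", "description.rdf", "record.xml")
    print(" ".join(command))
```

Keeping the command construction separate from the call to subprocess.run makes it possible to inspect or log the command before anything is executed.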

BIBFRAME 2.0 Tools 

A conversion tool is also available to enable viewing data converted from BIBFRAME to MARC at id.loc.gov/tools/bibframe/comparebf. This tool takes an LCCN or a Library of Congress BIB ID and shows the BIBFRAME for it on the left, using the latest conversion in metaproxy. On the right is a converted MARC record, using the latest bibframe2marc XSLT stylesheet. The record can be shown in MARC text format or MARCXML.
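
To make the difference between the two display formats concrete, the following sketch turns a small MARCXML record into a line-per-field text display. It is only an approximation under stated assumptions: the sample record is invented, and the text layout (tag, indicators, then $-prefixed subfields, with blank indicators shown as #) is a common convention rather than the exact output of the comparison tool.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by MARCXML documents.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Invented sample record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 i 4500</leader>
  <controlfield tag="001">example001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example title</subfield>
    <subfield code="b">a subtitle</subfield>
  </datafield>
</record>"""


def marcxml_to_text(xml_text: str) -> List[str]:
    """Render one MARCXML record as a list of text lines, one per field."""
    record = ET.fromstring(xml_text)
    lines = []
    for field in record:
        if field.tag == MARC_NS + "leader":
            lines.append("LDR " + (field.text or ""))
        elif field.tag == MARC_NS + "controlfield":
            lines.append(f"{field.get('tag')} {field.text or ''}")
        elif field.tag == MARC_NS + "datafield":
            # Show blank indicators as '#' so they stay visible.
            ind1 = (field.get("ind1") or " ").replace(" ", "#")
            ind2 = (field.get("ind2") or " ").replace(" ", "#")
            subfields = " ".join(
                f"${sf.get('code')} {sf.text or ''}"
                for sf in field.findall(MARC_NS + "subfield")
            )
            lines.append(f"{field.get('tag')} {ind1}{ind2} {subfields}")
    return lines


if __name__ == "__main__":
    print("\n".join(marcxml_to_text(SAMPLE)))
```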

MARC Conventions used in the Conversions

In the BIBFRAME to MARC conversion it was occasionally necessary to choose among alternatives. In addition, the Library of Congress makes extensive use of URIs in BIBFRAME data and wished to avoid losing these URIs in the MARC version of a description. The following conventions were followed.

  • The 008 field and the 007/00 and 007/01 positions are converted, but their values are also duplicated elsewhere in the format, in places where a URI for the value can be recorded.
  • Where MARC offers multiple locations for the same data, usually only one was chosen.
  • For data where MARC allows options, choices had to be made. For example, Model B was selected for records containing non-Latin data rather than Model A (www.loc.gov/marc/bibliographic/ecbdmulti.html). Thus the 880 field is not used in the records. This structure matches MARC Authority records that use Model B. Non-Latin data will appear in regular fields and there will be less transliteration of non-Latin data.
  • For LCSH subject headings, the URI for the whole string precedes the string, and the URI for a component follows the component it applies to.
  • Punctuation at subfield boundaries will not be inserted if it is not carried in the corresponding BIBFRAME element.
  • URIs are carried in the MARC $0 subfields.
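
The subject heading and $0 conventions above can be sketched as follows. This is one reading of the conventions, not the converter's actual logic: the function name, the heading, and the example.org URIs are all hypothetical placeholders, and the 650 field with second indicator 0 is used here as the usual MARC location for an LCSH topical heading.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("", MARC_NS)

# Placeholder URIs, for illustration only (not real id.loc.gov identifiers).
HEADING_URI = "http://example.org/lcsh/whole-heading"
COMPONENTS = [
    ("a", "Libraries", "http://example.org/lcsh/libraries"),
    ("x", "Automation", "http://example.org/lcsh/automation"),
]


def lcsh_field(heading_uri: str,
               components: List[Tuple[str, str, Optional[str]]]) -> ET.Element:
    """Build a 650 field: $0 for the whole string first, then each
    component followed by its own $0 when a URI is available."""
    field = ET.Element(f"{{{MARC_NS}}}datafield",
                       {"tag": "650", "ind1": " ", "ind2": "0"})

    def add(code: str, text: str) -> None:
        subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = text

    add("0", heading_uri)          # URI for the whole string precedes it
    for code, label, uri in components:
        add(code, label)           # no punctuation added at subfield boundaries
        if uri:
            add("0", uri)          # component URI follows its component
    return field


if __name__ == "__main__":
    print(ET.tostring(lcsh_field(HEADING_URI, COMPONENTS), encoding="unicode"))
```

In this sketch the subfield order comes out as $0, $a, $0, $x, $0, and the labels carry no added punctuation because none is supplied in the input.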