Catmandu::Importer::OAI(3pm) | User Contributed Perl Documentation | Catmandu::Importer::OAI(3pm) |
Catmandu::Importer::OAI - Package that imports OAI-PMH feeds
    # From the command line

    # Harvest records
    $ catmandu convert OAI --url http://myrepo.org/oai
    $ catmandu convert OAI --url http://myrepo.org/oai --metadataPrefix didl --handler raw

    # Harvest repository description
    $ catmandu convert OAI --url http://myrepo.org/oai --identify 1

    # Harvest identifiers
    $ catmandu convert OAI --url http://myrepo.org/oai --listIdentifiers 1

    # Harvest sets
    $ catmandu convert OAI --url http://myrepo.org/oai --listSets 1

    # Harvest metadataFormats
    $ catmandu convert OAI --url http://myrepo.org/oai --listMetadataFormats 1

    # Harvest one record
    $ catmandu convert OAI --url http://myrepo.org/oai --getRecord 1 --identifier oai:myrepo:1234
Catmandu::Importer::OAI is a Catmandu importer to harvest metadata records from an OAI-PMH endpoint.
Handlers can be provided as a function reference, as an instance of a Perl package that implements 'parse', or by a package NAME. A package name is either prepended by "+", in which case it is taken as a full package name, or it is prefixed with "Catmandu::Importer::OAI::Parser". For example, "foobar" will create a "Catmandu::Importer::OAI::Parser::foobar" instance.
By default the handler Catmandu::Importer::OAI::Parser::oai_dc is used for metadataPrefix "oai_dc", Catmandu::Importer::OAI::Parser::marcxml for "marcxml", Catmandu::Importer::OAI::Parser::mods for "mods", and Catmandu::Importer::OAI::Parser::struct for other formats. In addition there is Catmandu::Importer::OAI::Parser::raw to return the XML as it is.
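The naming rules above can be sketched as a small shell script. This is only an illustration of how a handler NAME and a metadataPrefix map to a package name, not the module's own code. The function names handler_package and default_handler and the package My::Parser are hypothetical, and the assumption that a leading "+" is stripped to give the full package name is labeled in the comments.

```shell
#!/bin/sh
# Namespace in which plain handler names are looked up (from the text above).
PARSER_NS="Catmandu::Importer::OAI::Parser"

# Map a handler NAME to a package name.
# Assumption: a leading "+" marks a full package name and is stripped.
handler_package() {
    case "$1" in
        +*) echo "${1#+}" ;;
        *)  echo "${PARSER_NS}::$1" ;;
    esac
}

# Pick the default handler package for a metadataPrefix:
# oai_dc, marcxml and mods have their own parser, anything else uses struct.
default_handler() {
    case "$1" in
        oai_dc|marcxml|mods) handler_package "$1" ;;
        *)                   handler_package struct ;;
    esac
}

handler_package foobar        # plain name, looked up in the parser namespace
handler_package +My::Parser   # hypothetical full package name
default_handler didl          # no dedicated parser, falls back to struct
```

On the command line the same names are passed with the --handler option, as in the "--handler raw" example at the top of this page.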
Retries of failed requests use the exponential backoff algorithm internally. After every failed request the importer chooses a random number between 0 (included) and 2^collision (excluded), where collision is the number of the retry, and waits that number of seconds. The actual amount of time before the importer stops can therefore differ:
    first retry:  wait [ 0..2^1 [ seconds
    second retry: wait [ 0..2^2 [ seconds
    third retry:  wait [ 0..2^3 [ seconds
    ..
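The wait windows listed above can be reproduced with a short shell sketch. It is not the importer's implementation: the function names backoff_window and backoff_wait are hypothetical, and drawing a whole number of seconds is an assumption made here for simplicity.

```shell
#!/bin/sh
# Exclusive upper bound of the wait window for retry N: 2^N seconds.
backoff_window() {
    echo $(( 1 << $1 ))
}

# Draw a wait time in [0, 2^N) for retry N.
# Assumption: whole seconds; the importer may use a different granularity.
backoff_wait() {
    window=$(backoff_window "$1")
    # awk's srand() seeds from the clock, so calls made within the same
    # second return the same value; good enough for an illustration.
    awk -v w="$window" 'BEGIN { srand(); print int(rand() * w) }'
}

for retry in 1 2 3; do
    echo "retry $retry: wait $(backoff_wait "$retry") of [ 0..$(backoff_window "$retry") [ seconds"
done
```

Because each wait is drawn at random, the total time spent over the same number of retries varies from run to run, while the upper bound of each window doubles.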
Every Catmandu::Importer is a Catmandu::Iterable, so all its methods are inherited. The Catmandu::Importer::OAI methods are not idempotent: OAI-PMH feeds can only be read once.
In addition to methods inherited from Catmandu::Iterable, this module provides the following public methods:
Process an XML DOM with the xslt and handler as configured and return the result.
If you are connected to the internet via a proxy server, you need to set the address of this proxy in your environment:
export http_proxy="http://localhost:8080"
If you are connecting to an HTTPS server and don't want to verify the validity of the peer's certificates, you can set PERL_LWP_SSL_VERIFY_HOSTNAME to false in your environment. This may be required to connect to broken SSL servers:
export PERL_LWP_SSL_VERIFY_HOSTNAME=0
Catmandu, Catmandu::Importer
Nicolas Steenlant, "<nicolas.steenlant at ugent.be>"
Patrick Hochstenbach, "<patrick.hochstenbach at ugent.be>"
Jakob Voss, "<nichtich at cpan.org>"
Nicolas Franck, "<nicolas.franck at ugent.be>"
Copyright 2016 Ghent University Library
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.
2023-02-08 | perl v5.36.0 |