Search Engine API

Introduction

Using the Search Engine API of an Invenio-based OAR, you can build, for example, a new front end for your repository. To search or browse records in the OAR, you only have to issue HTTP queries that specify a list of parameters describing the records users are interested in; you can then present the results in a nicer interface.

Use the following syntax to create the queries:

Syntax
GET /search?p=...&of=...&ot=...&jrec=...&rg=...
where

p is pattern (i.e. your query)

of is output format (e.g. xm for MARCXML)

ot is output tags (e.g. "" to get all fields, 100 to get titles only)

jrec is jump to record ID (e.g. 1 for first hit)

rg is records-in-groups-of (e.g. 10 hits per page)

You can use other parameters as well; the list above mentions the most useful ones. For full documentation on these and the other /search URL parameters, please see section 3.1 of the Search Engine API documentation [1].
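
As a rough illustration of the syntax above, the sketch below assembles a /search URL from name=value pairs in a POSIX shell. The helper name search_url and the BASE_URL variable are assumptions made for this example and are not part of Invenio; values are assumed to be URL-safe already.

```shell
#!/bin/sh
# Base URL of the repository used throughout this page.
BASE_URL="http://nadre.ethernet.edu.et"

# search_url (hypothetical helper): join "name=value" arguments with "&"
# and append them to the /search endpoint. No URL encoding is performed.
search_url() {
    query=""
    for pair in "$@"; do
        if [ -z "$query" ]; then
            query="$pair"
        else
            query="$query&$pair"
        fi
    done
    printf '%s/search?%s\n' "$BASE_URL" "$query"
}

# First 10 hits for the pattern "Hackfest", returned as MARCXML.
search_url "p=Hackfest" "of=xm" "jrec=1" "rg=10"
```

The printed URL can then be handed to a browser or any HTTP client.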

XML API

To get the results of your queries in XML format, set the output format parameter to xm (of=xm). The OAR returns the results in MARCXML [2].

Get records

So, to get the first 10 records stored in the OAR in XML format, you can use the following query:

http://nadre.ethernet.edu.et/search?jrec=1&rg=10&of=xm

You can change the jrec parameter to paginate through the records. For example, the query:

http://nadre.ethernet.edu.et/search?jrec=11&rg=10&of=xm

returns the next 10 records, starting from the eleventh. Use the following

http://nadre.ethernet.edu.et/search?jrec=21&rg=10&of=xm

to get the 21st to 30th records, and so on.

Warning

Do not set rg too high; there is a server-wide safety limit on it.
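
The pagination rule used above (each page starts rg records after the previous one) can be sketched as a small shell helper. The name page_url and the 1-based page numbering are assumptions made for illustration; the default of 10 records per page keeps rg well away from any server-side limit.

```shell
#!/bin/sh
# page_url (hypothetical helper): print the search URL for a 1-based page.
#   $1 = page number, $2 = records per page (defaults to 10)
page_url() {
    page=$1
    rg=${2:-10}
    # Page 1 starts at record 1, page 2 at record rg + 1, and so on.
    jrec=$(( (page - 1) * rg + 1 ))
    printf 'http://nadre.ethernet.edu.et/search?jrec=%d&rg=%d&of=xm\n' "$jrec" "$rg"
}

# Print the URLs of the first three pages (jrec=1, jrec=11, jrec=21).
for page in 1 2 3; do
    page_url "$page"
done
```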

Look for patterns in fields

To get, for example, the first 10 records that contain the string ‘Hackfest’ in the title, use the p parameter to specify the pattern you are looking for and the f parameter to specify the field in which to search for the pattern. See the query below:

http://nadre.ethernet.edu.et/search?p=Hackfest&f=title&jrec=1&rg=10&of=xm
Where

p is pattern (i.e. your query)

f is the field to search within (e.g. “title”, “authors”)
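
A minimal sketch of how p and f combine into one query, assuming a POSIX shell. The helper name field_search_url is hypothetical, and so is the two-word pattern in the second call. Only spaces are handled (they are turned into +), so patterns containing other reserved characters would still need proper URL encoding.

```shell
#!/bin/sh
# field_search_url (hypothetical helper): search for a pattern in one field.
#   $1 = pattern, $2 = field name
# Spaces become "+"; other reserved characters are NOT escaped here.
field_search_url() {
    pattern=$(printf '%s' "$1" | tr ' ' '+')
    field=$2
    printf 'http://nadre.ethernet.edu.et/search?p=%s&f=%s&jrec=1&rg=10&of=xm\n' \
        "$pattern" "$field"
}

# Single-word pattern taken from the example above.
field_search_url "Hackfest" "title"
# Hypothetical two-word pattern, to show how the space is handled.
field_search_url "open access" "title"
```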

If you want to get, for example, the first 10 records in the ‘PRESENTATIONSNADRE’ collection that contain ‘NADRE’ in their keywords, you can use:

http://nadre.ethernet.edu.et/search?p1=collection:PRESENTATIONSNADRE+keyword:NADRE&of=xm&jrec=1&rg=10

Filter records and outputs

To get all records uploaded from a given date (e.g. 2018-01-01) to another given date (e.g. 2018-02-22), you can issue:

http://nadre.ethernet.edu.et/search?of=xm&d1=2018-01-01&d2=2018-02-22
Where

d1 is the first date in YYYY-mm-dd format

d2 is the last date in YYYY-mm-dd format
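
Since d1 and d2 must follow the YYYY-mm-dd layout, it can help to check the shape of both dates before building the query. The following is only a sketch: is_date and date_range_url are hypothetical helper names, and the check covers the layout of the string, not whether the date exists in the calendar.

```shell
#!/bin/sh
# is_date (hypothetical helper): succeed only for strings shaped like YYYY-mm-dd.
is_date() {
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# date_range_url (hypothetical helper): print the search URL for records
# uploaded between $1 (d1) and $2 (d2); fail if either date is malformed.
date_range_url() {
    if ! is_date "$1" || ! is_date "$2"; then
        echo "dates must be in YYYY-mm-dd format" >&2
        return 1
    fi
    printf 'http://nadre.ethernet.edu.et/search?of=xm&d1=%s&d2=%s\n' "$1" "$2"
}

# Same date range as in the example above.
date_range_url 2018-01-01 2018-02-22
```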

JSON API

Internally, Invenio records are represented in JSON. To request JSON output, simply set of to recjson (of=recjson).

Before proceeding…

You need the following tools, which are used in the rest of this tutorial:
  • curl, a tool to transfer data from or to a server;
  • jq, a lightweight and flexible command-line JSON processor.

Note

If you are not on a *NIX-based system, you can use Postman and import the collection files/postman_collection.json to perform the queries.

The following are the same examples seen in the XML API section, but this time the results are in JSON format. Just copy each command into a shell session and look at the output.

Get records

Get first ten records

curl -X GET \
"http://nadre.ethernet.edu.et/search?jrec=1&rg=10&of=recjson" \
| jq .

Records from eleventh to twentieth

curl -X GET \
"http://nadre.ethernet.edu.et/search?jrec=11&rg=10&of=recjson" \
| jq .

From 21st to 30th

curl -X GET \
"http://nadre.ethernet.edu.et/search?jrec=21&rg=10&of=recjson" \
| jq .

Look for patterns in fields

Get the first 10 records that contain the string “Hackfest” in the title

curl -X GET \
'http://nadre.ethernet.edu.et/search?p=Hackfest&f=title&jrec=1&rg=10&of=recjson' \
| jq .

Get the first 10 records in the ‘PRESENTATIONSNADRE’ collection that contain ‘NADRE’ in their keywords

curl -X GET \
'http://nadre.ethernet.edu.et/search?p1=collection:PRESENTATIONSNADRE+keyword:NADRE&of=recjson&jrec=1&rg=10' \
| jq .

Filter records and outputs

Get all records uploaded from a given date (e.g. 2018-01-01) to another given date (e.g. 2018-02-22)

curl -X GET \
'http://nadre.ethernet.edu.et/search?of=recjson&d1=2018-01-01&d2=2018-02-22' \
| jq .

Get only the abstract, title and authors of resources

curl -X GET \
'http://nadre.ethernet.edu.et/search?of=recjson&ot=abstract,title,authors' \
| jq .
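
When the list of output fields changes often, the comma-separated ot value can be assembled from separate arguments. This is a small sketch under the assumption of a POSIX shell; fields_url is a hypothetical helper name, and the field names are the ones used in the command above.

```shell
#!/bin/sh
# fields_url (hypothetical helper): print a recjson search URL restricted to
# the given output fields. The body runs in a subshell, so the IFS change
# used to join the arguments with commas does not leak to the caller.
fields_url() (
    IFS=,
    printf 'http://nadre.ethernet.edu.et/search?of=recjson&ot=%s\n' "$*"
)

# Same fields as in the curl command above.
fields_url abstract title authors
```

To run the query, pass the printed URL to curl and pipe the response into jq, as in the commands above.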

[1] http://nadre.ethernet.edu.et/help/hacking/search-engine-api
[2] http://nadre.ethernet.edu.et/help/admin/howto-marc