API for the indexer

This is how the indexer calls Fyllo to get values for inclusion in the portal. It is not a public API but it is good to have an appreciation of the underlying mechanism.

Remember that Fyllo doesn't know anything about the classification of the names it tracks. Actually it knows nothing about the names either! It just stores the WFO IDs and calls the public GraphQL API to render a human-readable version. The Taxa page likewise just calls the GraphQL API to calculate the values that would be stored in the index for a particular taxon. This is why it displays the classification used at the top of the page.

Get modified names

Calling api.php with method GET and the parameter 'offset' returns a JSON array containing the WFO IDs of 1,000 names, together with their modification dates, ordered from most to least recently changed. To perform a delta update, a client can call api.php?offset=0, then api.php?offset=1000, and so on, until it reaches an entry older than its last sync date or the supply stops.
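The paging loop described above can be sketched as follows. This is only an illustration, not the indexer's actual code. The field names "wfo_id" and "modified" are assumptions, as is the idea that modification dates arrive as strings that sort chronologically; check a real response before relying on either. The fetch_page callable is a hypothetical stand-in for a GET to api.php?offset=N.

```python
from typing import Callable, Iterator

PAGE_SIZE = 1000  # page size stated in the text above


def modified_since(
    fetch_page: Callable[[int], list[dict]], last_sync: str
) -> Iterator[dict]:
    """Yield entries changed after last_sync, newest first.

    fetch_page(offset) stands in for GET api.php?offset=<offset>.
    Entry keys "wfo_id" and "modified" are assumed, not documented.
    """
    offset = 0
    while True:
        page = fetch_page(offset)
        if not page:
            return  # the supply has stopped
        for entry in page:
            # Entries are in descending order, so the first one at or
            # before the last sync date means we are up to date.
            if entry["modified"] <= last_sync:
                return
            yield entry
        offset += PAGE_SIZE
```

Passing the fetch function in as an argument keeps the stopping logic separate from the HTTP call, so it can be exercised against canned pages.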

Fetch values for taxon trees

POSTing to this URL with the body containing a JSON array of objects describing taxon graphs will return the index values for each of the taxon graphs in the array. There is a limit of 1,000 objects in each call.

A taxon graph is a simple structure. Each name is represented by an array of WFO IDs. Nearly always this contains a single WFO ID, but it will contain several when records have been deduplicated. Fyllo won't know about the deduplication because it is just given data tagged with WFO IDs, and those might be the IDs of a name record that has since been merged into another.

[
    {
        "classification": "9999-01",
        "taxon": [
            "wfo-0000632146"
        ],
        "path": [
            [
                "wfo-9971000003"
            ],
            [
                "wfo-4100001250"
            ],
            [
                "wfo-4100003335"
            ],
            [
                "wfo-9949999999",
                "wfo-9499999999"
            ],
            [
                "wfo-9000000022"
            ],
            [
                "wfo-7000000036"
            ],
            [
                "wfo-4000010286"
            ]
        ],
        "synonyms": [
            [
                "wfo-0000431439"
            ],
            [
                "wfo-0000540588"
            ],
            [
                "wfo-0000540624"
            ],
            [
                "wfo-0000540650"
            ],
            [
                "wfo-0000540651"
            ],
            [
                "wfo-0000540653"
            ]
        ]
    }
]
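A client might assemble the request body along these lines. This is a minimal sketch under stated assumptions: the helper names taxon_graph and batches are hypothetical, and the only facts taken from the text are the four keys shown in the example and the limit of 1,000 objects per call.

```python
import json
from typing import Iterator, Sequence

MAX_GRAPHS_PER_CALL = 1000  # limit stated in the text above


def taxon_graph(
    classification: str,
    taxon_ids: Sequence[str],
    path: Sequence[Sequence[str]],
    synonyms: Sequence[Sequence[str]],
) -> dict:
    """Build one taxon graph. Every name is a list of WFO IDs,
    usually of length one, longer where records were merged."""
    return {
        "classification": classification,
        "taxon": list(taxon_ids),
        "path": [list(name) for name in path],
        "synonyms": [list(name) for name in synonyms],
    }


def batches(graphs: list, size: int = MAX_GRAPHS_PER_CALL) -> Iterator[list]:
    """Split the graphs into chunks small enough for a single POST."""
    for start in range(0, len(graphs), size):
        yield graphs[start:start + size]


graph = taxon_graph(
    "9999-01",
    ["wfo-0000632146"],
    [["wfo-9971000003"], ["wfo-9949999999", "wfo-9499999999"]],
    [["wfo-0000431439"]],
)
body = json.dumps([graph])  # the POST body for a one-graph call
```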

The return structure is similar to that required to update a SOLR index.

[
    {
        "taxon": "wfo-0000632146",
        "classification": "9999-01",
        "wfo-f-2_ss": [
            "wfo-fv-52",
            "wfo-fv-72",
            "wfo-fv-182"
        ],
        "wfo-fv-52_provenance_ss": [
            "wfo-0000632146-s-60-direct",
            "wfo-0000632146-s-64-direct",
            "wfo-0000632146-s-65-direct"
        ],
        "wfo-f-2_t": "Countries (ISO) :  Chile [CL] Ecuador [EC] Peru [PE]",
        "wfo-fv-72_provenance_ss": [
            "wfo-0000632146-s-87-direct"
        ],
        "wfo-fv-182_provenance_ss": [
            "wfo-0000632146-s-192-direct"
        ],
        "wfo-f-8_ss": [
            "wfo-fv-407",
            "wfo-fv-409",
            "wfo-fv-453",
            "wfo-fv-489",
            "wfo-fv-601"
        ],
        "wfo-fv-407_provenance_ss": [
            "wfo-0000632146-s-1109-direct"
        ],
        "wfo-f-8_t": "TDWG Botanical Area :  Chile Central Chile North Gal Juan Fern Peru",
        "wfo-fv-409_provenance_ss": [
            "wfo-0000632146-s-1111-direct"
        ],
        "wfo-fv-453_provenance_ss": [
            "wfo-0000632146-s-1155-direct"
        ],
        "wfo-fv-489_provenance_ss": [
            "wfo-0000632146-s-1191-direct"
        ],
        "wfo-fv-601_provenance_ss": [
            "wfo-0000632146-s-1303-direct"
        ],
        "wfo-f-5_ss": [
            "wfo-fv-1887"
        ],
        "wfo-fv-1887_provenance_ss": [
            "wfo-0000632146-s-1554-direct",
            "wfo-0000540650-s-1554-synonym"
        ],
        "wfo-f-5_t": "Life Form :  Annual",
        "snippet_text_categories_ss": [
            "link-out"
        ],
        "snippet_text_languages_ss": [
            "zzz"
        ],
        "snippet_text_name_ids_ss": [
            "wfo-0000632146"
        ],
        "snippet_text_ids_ss": [
            "wfo-snippet-21500"
        ],
        "snippet_text_sources_ss": [
            "wfo-ss-1803"
        ],
        "snippet_text_bodies_txt": [
            "https:\/\/www.ncbi.nlm.nih.gov\/Taxonomy\/Browser\/wwwtax.cgi?id=3026891"
        ]
    }
]
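Judging only from the example above, each "wfo-f-N_ss" field lists facet value IDs and each of those has a matching "wfo-fv-N_provenance_ss" field. On that assumption (the naming pattern is inferred from the sample, not from a specification), a client could regroup a returned document like this. The function name facet_provenance is hypothetical.

```python
import re

# Matches facet fields such as "wfo-f-2_ss" but not "wfo-f-2_t".
FACET_KEY = re.compile(r"^(wfo-f-\d+)_ss$")


def facet_provenance(doc: dict) -> dict:
    """Map facet ID -> {facet value ID -> list of provenance strings}."""
    result = {}
    for key, value_ids in doc.items():
        match = FACET_KEY.match(key)
        if match is None:
            continue  # not a facet field (taxon, snippets, text, ...)
        result[match.group(1)] = {
            value_id: doc.get(f"{value_id}_provenance_ss", [])
            for value_id in value_ids
        }
    return result


sample = {
    "taxon": "wfo-0000632146",
    "wfo-f-5_ss": ["wfo-fv-1887"],
    "wfo-fv-1887_provenance_ss": [
        "wfo-0000632146-s-1554-direct",
        "wfo-0000540650-s-1554-synonym",
    ],
    "wfo-f-5_t": "Life Form :  Annual",
}
grouped = facet_provenance(sample)
```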

Facet and Source metadata

The metadata for the Facets and Sources is stored as separate documents in the index, not in every taxon record. There is therefore a call to retrieve this data for update. It is a single call with no paging, as the data shouldn't get that big.

api.php?metadata=facets

api.php?metadata=sources

Scores and Snippets

We also need to add the extended metadata for all the Facet Value scores and the Snippets. This is done with the following calls. They accept a "since" parameter as a Unix timestamp so you don't need to re-index everything every time. A maximum of 1,000 results are returned, ordered by modified date. You can page up to the current date by simply calling again with "since" set to the last modified date you received. (You'll need to convert it to a timestamp.)

api.php?metadata=scores&since=1774880981

api.php?metadata=snippets&since=1774880981
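The conversion mentioned above might look like the sketch below. The date format "YYYY-MM-DD HH:MM:SS" and the UTC time zone are both assumptions, since the text does not say how modified dates are serialised; adjust the format string to match a real response. The function names are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_unix(modified: str) -> int:
    """Convert an assumed 'YYYY-MM-DD HH:MM:SS' UTC date to a Unix timestamp."""
    parsed = datetime.strptime(modified, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def next_call(kind: str, last_modified: str) -> str:
    """Build the next paging URL for kind 'scores' or 'snippets'."""
    query = urlencode({"metadata": kind, "since": to_unix(last_modified)})
    return f"api.php?{query}"
```

If a full page of results all share one modified date, calling again with that same date could return the same page, so a client may want to guard against making no progress.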

Authentication & authorisation

These API calls can be expensive to serve. We don't want a bot to get in here and start scraping, so every call must carry a key in a request header to be processed. Keys are configured manually and stored in the configuration file. There is a test script in the code that shows how the bearer token can be passed.
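For orientation, a bearer token is conventionally sent in an Authorization header. Whether Fyllo expects exactly this header is an assumption here; the test script in the code is the authority. The host name below is a placeholder and the function name is hypothetical.

```python
from urllib.request import Request


def authorised_request(url: str, key: str) -> Request:
    """Build a GET request carrying the key as a bearer token.

    The 'Authorization: Bearer <key>' form is the usual convention,
    assumed here rather than confirmed for Fyllo.
    """
    return Request(url, headers={"Authorization": f"Bearer {key}"})


# Placeholder host and key, for illustration only.
request = authorised_request("https://example.org/api.php?metadata=facets", "my-key")
```

The request object can then be passed to urllib.request.urlopen to make the call.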