USDA Sources Sought: Upgrade and Initialization of National Agricultural Library Thesaurus for the Machine Age

Notice ID 1043011

“Project ‘NALT for the Machine Age’

This project will complete the upgrade and initialization of the NAL Thesaurus (NALT) for the Machine Age and optimize the expertise of its central curation team by facilitating the maintenance of sub-schemes within NALT by external communities of experts to meet their unique needs, and through leveraging Machine Learning and Artificial Intelligence, both to automate subject indexing using NALT and to support the curation of NALT itself. Initiation of the editorial groups and best practices will be developed, documented and coordinated. VocBench software has been developed specifically to support this work. The project will be established and the ongoing editorial and curation will continue as NALT evolves both in terms of technical advances and optimization, management, and content. The goal is for NALT to be state-of-the-art now, and to create the computer and human workflows that will maintain this designation in perpetuity as the technology advances.

The following work will run over the period of performance 9/1/2021 – 9/30/2023 (base plus one option year).

WP2: Automated subject indexing testbed and evaluation

A testbed for subject indexing was created under prior contract number 1232SA20-P-0083 by training the open-source platform Annif, from the National Library of Finland (NLF), with NALT and NALT Core, and its results will be compared with the results of existing indexing systems. Following on from this work, NAL staff will be trained to use, maintain and support the testbed, and to assist NAL patrons partnering with the NALT vocabulary. This SOW will focus on building pipelines for a workflow connecting functionality from vocabulary building, displaying and archiving (from VocBench to SKOSMOS), and on training NAL staff to use SKOS files with NAL or other data to build models with Annif and make them accessible to users.
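To make the data conversion step concrete, here is a minimal sketch of turning indexed records into a tab-separated training corpus of the kind Annif reads: one document per line, the text first, then a tab, then the subject URIs in angle brackets. The function name `to_annif_tsv` and the `example.org` URIs are hypothetical placeholders, not NALT identifiers, and the layout reflects the TSV corpus convention in Annif's documentation as understood here; it should be checked against the Annif version in use.

```python
def to_annif_tsv(records):
    """Serialize (text, subject_uris) pairs, one tab-separated line per document.

    Assumed layout: <text> TAB <uri1> <uri2> ...
    with each URI wrapped in angle brackets.
    """
    lines = []
    for text, uris in records:
        # Collapse tabs and newlines so each document stays on a single line.
        clean_text = " ".join(text.split())
        subjects = " ".join(f"<{uri}>" for uri in uris)
        lines.append(f"{clean_text}\t{subjects}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical record: an abstract plus the subject URIs assigned by an indexer.
    sample = [
        (
            "Soil moisture and maize yield under drip irrigation",
            ["https://example.org/nalt/1", "https://example.org/nalt/2"],
        )
    ]
    print(to_annif_tsv(sample), end="")
```

A file written this way could then serve as training input for an Annif project; the training commands themselves are omitted here because they depend on the local Annif configuration.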

“The Annif Project at the National Library of Finland has developed an open-source platform for automating the process of subject indexing. When trained using subject vocabularies such as NALT, Annif can analyze the contents of scientific papers using a combination, or “ensemble”, of open-source natural language processing and machine learning algorithms and propose subject headings for approval by expert indexers. The contractor will work with the Indexing and Informatics Branch (IIB) of NAL, ideally in collaboration with NLF, on implementing an Annif testbed at NAL.

  • Dump: Extract data sets from NAL systems…
  • Corpus: Training or support for NAL staff to use data conversion tools (to Annif corpus format), train Annif models, and perform numeric evaluations…
  • Pipeline: VocBench to SKOS outputs, search, files, on SCINet as proof of concept, etc…
  • docs: Extract documents for tool comparison…
  • cogito: Generate test annotations with Cogito…
  • annif: Generate test annotations with Annif…
  • compare: Compare Cogito annotations with Annif annotations…”
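The final comparison step can be sketched as a set overlap between the subjects each tool assigns to the same documents. The notice does not say how the comparison will be scored, so the micro-averaged precision, recall and F1 below are an assumption, as is treating the Cogito output as the reference side; the function name and the sample identifiers are hypothetical.

```python
def compare_annotations(reference, candidate):
    """Micro-averaged precision, recall and F1 of `candidate` against `reference`.

    Both arguments map a document ID to an iterable of subject identifiers.
    Which tool counts as the reference is an assumption of this sketch.
    """
    tp = fp = fn = 0
    for doc_id in reference.keys() | candidate.keys():
        ref = set(reference.get(doc_id, ()))
        cand = set(candidate.get(doc_id, ()))
        tp += len(ref & cand)   # subjects both tools assigned
        fp += len(cand - ref)   # subjects only the candidate assigned
        fn += len(ref - cand)   # subjects only the reference assigned
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical annotations for two documents.
    cogito = {"doc1": ["maize", "irrigation"], "doc2": ["soil"]}
    annif = {"doc1": ["maize", "drought"], "doc2": ["soil"]}
    print(compare_annotations(cogito, annif))
```

Because neither tool's output is a gold standard, swapping the arguments simply exchanges precision and recall; scoring both against expert-assigned headings would need a third input.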


This topic contains 0 replies, has 1 voice, and was last updated by Jackie Gilbert 4 days, 11 hours ago.



©2021 MileMarker10, LLC. All rights reserved.
