“The Digital Library of Mathematical Functions (DLMF) Project at the National Institute of Standards and Technology (NIST) requires contractor support to improve the semantics and machine-readability of the mathematical content of the online resource as well as to develop a catalog and test-bed of mathematical notations and their associated meanings.
The DLMF (available at http://dlmf.nist.gov) has been online since May 2010 and has experienced widespread acceptance in the scientific and engineering community. In the initial release, the focus was on the usability and informativeness of the website for human readers; as a result, the mathematics is developed and presented relatively informally.
Attention must now turn to making the information more machine-readable and available to automated processes. In particular, formulas and other information must be represented in a computable form (Content MathML), with the DLMF constituting a mathematical database. Additionally, DLMF data may be critical in bootstrapping other efforts, such as providing sufficient semantic training data for use in data mining and machine learning.”
“Objectives and Scope of Work – The source documents of the DLMF consist of LaTeX markup enhanced with ‘semantic’ macros representing many known mathematical concepts, as well as declarative markup indicating types, grammatical roles and definitions. When processed with the LaTeXML system, developed for this purpose, this markup is sufficient to generate portable and attractive web content, in particular with readable and accessible mathematics (Presentation MathML), at least as far as human readers are concerned. However, the current process and enhanced markup are not sufficient to fully disambiguate the mathematical meaning of each formula for computational purposes (Content MathML). Examples of such weaknesses lie in properly interpreting sub- and superscripts and in associating constraints with formulas.
The long-term goal is to represent all of the DLMF’s mathematical formulas in completely semantic, computable form, ideally by developing automated inference systems to carry out the resolution. This is a significant research effort. The scope of this acquisition is to acquire contractor support to characterize these processing and representation issues, and to develop and implement strategies to resolve them. Since the goal is to make this enhanced information available on the web, knowledge of the relevant standards that enable the representation of such knowledge, as well as participation in the associated standards activities, is necessary.
Additionally, since the DLMF employs a variety of mathematical notations and symbols, it can provide a seed for catalogs of those notations and symbol definitions. These types of catalogs are useful for test cases in other document processing research, particularly in Mathematical Knowledge Management. The current requirement continues with the collection of notations and meanings towards the development of such test catalogs.”
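The superscript weakness named in the scope of work can be made concrete with a small sketch. The example below is illustrative only and is not taken from the DLMF sources or from LaTeXML: it assumes a hypothetical formula fragment f^{-1} and shows that one Presentation MathML tree is compatible with at least two Content MathML readings, here encoded with the OpenMath symbols `inverse` (Content Dictionary `fns1`) and `power` (Content Dictionary `arith1`). The function names are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def presentation_f_superscript():
    """Presentation MathML for f^{-1}: layout only, no meaning attached."""
    msup = ET.Element("msup")
    ET.SubElement(msup, "mi").text = "f"
    mrow = ET.SubElement(msup, "mrow")
    ET.SubElement(mrow, "mo").text = "-"
    ET.SubElement(mrow, "mn").text = "1"
    return msup


def content_inverse_function():
    """Reading 1: the superscript denotes the inverse function of f."""
    apply_el = ET.Element("apply")
    ET.SubElement(apply_el, "csymbol", cd="fns1").text = "inverse"
    ET.SubElement(apply_el, "ci").text = "f"
    return apply_el


def content_reciprocal():
    """Reading 2: the superscript denotes f raised to the power -1."""
    apply_el = ET.Element("apply")
    ET.SubElement(apply_el, "csymbol", cd="arith1").text = "power"
    ET.SubElement(apply_el, "ci").text = "f"
    ET.SubElement(apply_el, "cn", type="integer").text = "-1"
    return apply_el


if __name__ == "__main__":
    for build in (presentation_f_superscript,
                  content_inverse_function,
                  content_reciprocal):
        print(build.__name__)
        print("  " + ET.tostring(build(), encoding="unicode"))
```

Both content trees render to the same presentation, which is why additional declarative or inferred information is needed to choose between them.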
“Specific Tasks – The Contractor shall provide project leadership and programming support to the DLMF project team.
Task 1: Analyze the DLMF data processing model, with emphasis on identifying and categorizing the mathematical ambiguities, semantic weaknesses and lack of computability. Based on the initial surveys of the range of ambiguities present in the DLMF, focus on characterizing the several interpretations of ambiguous notations, with the aim of developing strategies to resolve them. These strategies include: correlating the encodings of known symbols with OpenMath Content Dictionaries, creating new Dictionaries where required; representing multiple interpretations of symbols; increasing the coverage of mathematical grammars to capture a significant portion of conventional mathematical notations; developing methods to infer the intended semantics of symbols and expressions using minimal declarative information, and to leverage that information to prune parsing and reduce further ambiguities; and discovering any third-party software tools that can help in the analysis and/or resolution of these issues.
Task 2: Implement the strategies developed in Task 1 including, as appropriate: patching existing software tools (LaTeXML, written in Perl); increasing the coverage of mathematical grammars; developing and integrating any required external software tools; and developing additional software tools that support the analysis and discovery process.
Task 3: Enhance the catalog of mathematical notations to include a wider range of notations. The catalog is based on the most common DLMF notations; this work shall focus on the less common DLMF notations as well as notations from other references. Particular attention shall be given to both notational and semantic ambiguities in the context of Task 1.
Task 4: Explore and develop parsing grammars and strategies, building on and improving the ambiguous parsing prototypes NIST currently has. Emphasis shall be on representing multiple successful parses while maximizing use of any available type or signature information (or similar), and especially on pruning those parses which are not self-consistent. Implement and apply these strategies to informal documents such as those found on arXiv (https://arxiv.org).
Task 5: Participate in relevant World Wide Web Consortium (W3C) activities, in particular the Math Working Group, in establishing and propagating standards for the representation and delivery of semantically rich and accessible mathematical information, such as MathML…”
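The idea in Task 4 of keeping multiple successful parses and pruning the inconsistent ones can be sketched in a few lines. This is a toy illustration under stated assumptions, not the NIST prototype or LaTeXML's grammar: it assumes a hypothetical juxtaposition such as a(b+c), which may be read either as function application or as multiplication, and a hypothetical table of declared types (`"function"` or `"number"`) for the leading symbol. All names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parse:
    reading: str   # "apply" (function application) or "times" (product)
    head: str      # symbol written before the parentheses
    argument: str  # text inside the parentheses, left unparsed here


def candidate_parses(head, argument):
    """Return every reading the notation head(argument) admits."""
    return [Parse("apply", head, argument), Parse("times", head, argument)]


def is_consistent(parse, declared_types):
    """Check one parse against the declared type of its head, if any."""
    kind = declared_types.get(parse.head)
    if kind is None:
        # No declaration: nothing contradicts this reading, so keep it.
        return True
    if parse.reading == "apply":
        return kind == "function"
    return kind == "number"


def prune(parses, declared_types):
    """Keep only the parses that agree with the available type information."""
    return [p for p in parses if is_consistent(p, declared_types)]


if __name__ == "__main__":
    parses = candidate_parses("a", "b+c")
    print(prune(parses, {}))                 # both readings survive
    print(prune(parses, {"a": "function"}))  # only the "apply" reading
    print(prune(parses, {"a": "number"}))    # only the "times" reading
```

Keeping undeclared cases ambiguous, rather than guessing, matches the stated emphasis on representing multiple parses and discarding only those that contradict known information.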
“Period of Performance 06/19/2021 – 06/18/2022”