During neutron or muon experiments, and on many other occasions, researchers can be confronted with unusual or unexpected results. Being able to quickly understand what they are seeing would allow them to adapt their experiment before their beamtime finishes, making the best use of this valuable time.
In two studies, both published in Scientific Data, researchers from ISIS and the University of Cambridge have created databases that will be able to help in this situation. Professor Jacqui Cole and her team used their artificial intelligence (AI) software, ChemDataExtractor, to mine the scientific literature for materials and property information and collate it to auto-generate very large experimental materials databases for the scientific community.
The studies focus on the areas of semiconductors and optical materials research. In one, Jacqui and Qingyang Dong used data reported in over 125,000 journal articles to create a database of semiconducting materials and their band gaps.
In the other study, Jiuyang Zhao took information from over 375,000 articles and created a database of almost 50,000 refractive index and over 60,000 dielectric constant data records on 11,054 unique chemicals.
By collating this information into their databases, Jacqui, Qingyang and Jiuyang have created a valuable resource that ISIS users can use to understand structure and property trends in materials much faster. Quick access to this information will help users to interpret their results while they are still at ISIS, so that they can adapt their experiment if needed.
“Given the strength of the ISIS user community in semiconductors and optical materials research within its functional materials portfolio, the hope is that such databases provide a large group of the ISIS user community with ‘data at their fingertips’ about materials and properties that are relevant to their research,” explains Jacqui.
She adds: “In a wider sense, the availability of such large datasets of experimental data on topics for core materials physics have been notably lacking and yet are highly sought after, especially for the machine learning community where data scientists wish to mine them for data-driven materials discovery.”
Although similar computationally generated databases are readily sourced via high-throughput calculations, they are not experimentally validated. Therefore, having quick access to this large quantity of experimental data is hugely valuable.
These two studies are part of a larger project on developing data-science platforms for ISIS that Jacqui is leading as part of her joint appointment between ISIS and the University of Cambridge.
Further information
The two papers can be found online at DOI: 10.1038/s41597-022-01295-5 and DOI: 10.1038/s41597-022-01294-6
The databases can be accessed at http://www.opticalmaterials.org/ and https://doi.org/10.6084/m9.figshare.14079863.