●●● the PAULING FILE project
see review article in Chemistry of Metals and Alloys
One of the most challenging tasks in materials science is the design of new materials with tailored properties. Two different approaches are generally explored:
► The first one consists of simulating the motion of the atoms in the material and their electronic interactions by performing ab-initio calculations at the quantum-mechanical level. This approach does not (at least in principle) rely on experiments, but it is computationally demanding and can currently be applied only to a limited number of rather simple solids.
► The second approach remains at a more pragmatic level: most of our current knowledge in materials science has been collected empirically, by searching for patterns in experimental observations. During the past 100 years, huge amounts of data have been collected, making it possible to use modern computer technology to search for additional correlations. This approach, however, depends on the availability of a sufficiently large amount of experimental data of appropriate quality.
The shortcomings of the empirical approach provided the basic motivation for the initiation of the PAULING FILE project, which was launched in 1995 as a collaboration between the Japan Science and Technology Agency (JST), Material Phases Data System (MPDS), and The University of Tokyo, RACE. Since 2002 the project has been under the leadership of MPDS.
The first goal is to create and maintain a comprehensive materials database for non-organic (no C-H bonds) solid-state materials, covering phase diagrams, crystallographic data, diffraction patterns, and physical properties. Particular attention must be paid to data checking, since unrecognized errors will confuse the correlation tools. A first edition of the PAULING FILE database, limited to binary systems, was released in 2002.
●●● materials data mining - discoveries of 'hidden' patterns using the PAULING FILE
Revolutionary developments (Internet, powerful search engines, relational database management systems, etc.) have taken place in the areas of telecommunication and software development - science fiction has become reality. This is in contrast to the area of materials data storage, where there exist many small databases with little compatibility, no overall concept, limited continuity, limited financial stability, etc. Here we are still in the stone age. Without drastic changes within the next decades, this will produce a bottleneck in materials research.
The PAULING FILE, a comprehensive, fully relational, compatible materials database system, was started in 1995 as an 'exemplary' remedy to this situation. We hope that the application of materials data mining/exploration to the PAULING FILE will lead to the discovery of new patterns that can be applied in materials design.
Several patterns observed within thousands of data sets of different materials have been published. Relatively simple maps showing well-defined domains provide condensed overviews of experimentally determined data and offer some prediction ability. The discoveries show that elemental-property parameters of the constituent chemical elements can be used for the parameterization of intrinsic properties of the compounds.
|For a fisherman, efficient data mining means deducing where he has the highest probability of finding fish, but does not guarantee that he will catch one.|
●● compound formation map
In order to verify our previously published statement: "Structure-sensitive properties of materials can be quantitatively described by elemental-property parameters of the chemical elements", we investigated the presence, or absence, of compounds in binary, ternary, and quaternary systems, using experimental data for about 15,500 chemical systems, extracted from over 35,000 publications. The reformulated postulate reads: "Structure-sensitive intrinsic properties of materials can be quantitatively described by the elemental-property parameters Atomic Number (AN) and Periodic Number (PN), or simple mathematical functions of these, of the chemical elements". This generalization is an important step in strategically exploring structure-sensitive intrinsic properties of materials.
By using newly discovered elemental-property parameter expressions as axes in a two- or three-dimensional parameter space, we reached an accuracy of 98% in separating formers and non-formers into distinct domains. The compound formation maps make it possible to predict the existence of compounds in chemical systems that have not yet been investigated.
|Separation of 2,330 binary systems into compound formers (blue) and non-formers (yellow) in a compound formation map showing max[PN(A) / PNmax, PN(B) / PNmax] (y-axis) vs. [PN(A) / PNmax × PN(B) / PNmax] (x-axis), where PN is the Periodic Number (a distinct integer assigned to each chemical element based on its position in Mendeleev's periodic system).|
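The two map coordinates given in the caption can be computed directly from the Periodic Numbers of the two elements. The following is a minimal sketch, not the authors' implementation: the function name `formation_map_coords` is hypothetical, `pn_max` is assumed to be the largest Periodic Number on the scale (its value is not stated in this text), and the PN values in the usage lines are arbitrary placeholders rather than the PN of any real element.

```python
from fractions import Fraction


def formation_map_coords(pn_a: int, pn_b: int, pn_max: int):
    """Return (x, y) for a binary system A-B on the compound formation map.

    x = PN(A)/PNmax * PN(B)/PNmax
    y = max(PN(A)/PNmax, PN(B)/PNmax)

    pn_max is assumed to be the largest PN on the scale (an assumption;
    the text does not give its value).
    """
    if not (1 <= pn_a <= pn_max and 1 <= pn_b <= pn_max):
        raise ValueError("PN values must lie between 1 and pn_max")
    a = Fraction(pn_a, pn_max)  # normalized PN of element A
    b = Fraction(pn_b, pn_max)  # normalized PN of element B
    return a * b, max(a, b)


# Placeholder PN values, chosen only to illustrate the arithmetic.
x, y = formation_map_coords(pn_a=10, pn_b=50, pn_max=100)
print(float(x), float(y))  # 0.05 0.5
```

Both coordinates are symmetric in A and B, so a binary system maps to the same point regardless of the order in which its elements are listed.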
●● atomic environment type stability map for AB compounds
The atomic environment types (AETs), also called coordination polyhedra, realized by each chemical element in binary compounds at the equiatomic composition, were analyzed on a comprehensive set of literature data (about 2,800 binary systems from over 8,000 publications). The Periodic Number (PN) was used to classify the chemical systems. An atomic environment type map using the maximum PN vs. the ratio between the minimum and the maximum PN as coordinates proved convenient for subdividing the chemical systems so that different atomic environment types occur in distinct stability domains. The same AET stability map also showed a clear separation between chemical systems where intermediate compounds form and those where no compounds form. The AET stability map makes it possible to predict the existence of AB compounds with a particular atomic environment.
|Atomic environment type (AET) stability map showing the Periodic Number PNmax (y-axis) vs. PNmin / PNmax (x-axis) for equiatomic AB compounds. The AET of the element with the highest Periodic Number is given on the left-hand side of x = 1, and the AET of the element with the lowest Periodic Number in the same compound on the right-hand side in the same row.|
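For this map, PNmax and PNmin denote the larger and the smaller Periodic Number of the two elements in the AB compound. A minimal sketch of the coordinate calculation follows; the function name `aet_map_coords` is hypothetical and the PN values in the usage lines are placeholders, not the PN of any real element.

```python
def aet_map_coords(pn_a: int, pn_b: int):
    """Return (x, y) for an equiatomic AB compound on the AET stability map.

    x = PNmin / PNmax   (ratio of the smaller to the larger PN, 0 < x <= 1)
    y = PNmax           (the larger of the two PN values)
    """
    if pn_a < 1 or pn_b < 1:
        raise ValueError("PN values must be positive integers")
    pn_hi = max(pn_a, pn_b)
    pn_lo = min(pn_a, pn_b)
    return pn_lo / pn_hi, pn_hi


# Placeholder PN values, chosen only to illustrate the arithmetic.
print(aet_map_coords(20, 80))  # (0.25, 80)
```

The sketch only computes the position of a chemical system; how the two AETs of a compound are laid out on either side of x = 1 is a plotting choice described in the caption and is not reproduced here.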
●● generalized atomic environment type matrix
The atomic environment types (AETs) realized in binary and multinary inorganic compounds were studied by analyzing literature data for 290,000 atom sites from over 50,000 publications. The Periodic Number (PN) was successfully used to classify the AETs in a generalized (PN (central atom) vs. PN (coordinating atoms)) - AET matrix. Chemical elements with PN > 54 (from column VIIB to the right-hand side of the periodic system) control the atomic environment types, regardless of whether they act as central or as coordinating atoms. A generalized AET stability map, using PNmax vs. PNmin / PNmax as coordinates, subdivides the 'central atom - coordinating atoms' combinations so that different atomic environment types occur in distinct 'AET class stability domains'. The same AET matrix (and likewise the AET stability map) also shows a clear separation between possible and impossible 'central atom - coordinating atoms' combinations. For a chemical element assumed to act as central atom, the AET matrix and AET stability map make it possible to predict the most probable AET formed with any coordinating chemical elements, regardless of the stoichiometry or the number of chemical elements in the compound.
Generalized AET matrix PN(A) vs. PN(B), which is independent of the stoichiometry and the number of chemical elements in the compound. The element A occupying the center of the AET is given on the y-axis and the coordinating element B on the x-axis.
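One way to picture such a matrix is as a lookup table keyed by (PN of the central atom, PN of the coordinating atom) that tallies the AETs observed for each combination. The sketch below is only an illustration of that idea under stated assumptions: the text does not describe how the matrix was actually constructed, the function names are hypothetical, "most probable" is taken here to mean "most frequently observed", and the site records and AET labels are invented placeholders, not PAULING FILE data.

```python
from collections import Counter, defaultdict


def build_aet_matrix(sites):
    """Tally AET labels per (PN central atom, PN coordinating atom) pair.

    sites: iterable of (pn_central, pn_coordinating, aet_label) records.
    """
    matrix = defaultdict(Counter)
    for pn_central, pn_coord, aet_label in sites:
        matrix[(pn_central, pn_coord)][aet_label] += 1
    return matrix


def most_probable_aet(matrix, pn_central, pn_coord):
    """Return the most frequently tallied AET label, or None if the
    combination does not occur in the data."""
    counts = matrix.get((pn_central, pn_coord))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Invented placeholder records: (PN central, PN coordinating, AET label).
sites = [
    (12, 70, "AET-x"),
    (12, 70, "AET-x"),
    (12, 70, "AET-y"),
    (30, 95, "AET-y"),
]
matrix = build_aet_matrix(sites)
print(most_probable_aet(matrix, 12, 70))  # AET-x
print(most_probable_aet(matrix, 1, 2))    # None
```

In this toy version a `None` result only means that the combination is absent from the sample records; it says nothing about whether the combination is chemically possible.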
Another recent publication demonstrates what kind of broad information can be retrieved from the PAULING FILE.
Villars P., Cenzual K., Daams J.L.C., Hulliger F., Massalski T.B., Okamoto H., Osaki K., Prince A., Iwata S. (Eds.) (2002), PAULING FILE, Binaries Edition, ASM International, Materials Park; Villars P., Berndt M., Brandenburg K., Cenzual K., Daams J., Hulliger F., Massalski T., Okamoto H., Osaki K., Prince A., Putz H., Iwata S. (2004), The PAULING FILE, Binaries Edition, J. Alloys Compd. 367, 293-297.
 Villars P., Brandenburg K., Berndt M., LeClair S., Jackson A., Pao Y.-H., Igelnik B., Oxley M., Bakshi B., Chen P., Iwata S. (2001), Binary, ternary and quaternary compound former/nonformer prediction via Mendeleev number, J. Alloys Compd. 317/318, 26-38.
 Villars P., Daams J., Shikata Y., Rajan K., Iwata S. (2008), A new approach to describe elemental-property parameters, Chem. Met. Alloys 1, 1-23.
 Villars P., Cenzual K., Daams J., Chen Y., Iwata S. (2004), Data-driven atomic environment prediction for binaries using the Mendeleev number: Part 1. Composition AB, J. Alloys Compd. 367, 167-175.
 Villars P., Daams J., Shikata Y., Chen Y., Iwata S. (2008), Data-driven generalized atomic environment prediction for binary and multinary inorganic compounds using the Periodic Number, Chem. Met. Alloys 1, 210-226.
 Villars P., Iwata S. (2013), PAULING FILE verifies / reveals 12 principles in materials science supporting four cornerstones given by Nature, Chem. Met. Alloys 6, 81-108.
 Uhrin M., Pizzi G., Mounet N., Marzari N., Villars P., A high-throughput computational study driven by the AiiDA materials informatics framework and the PAULING FILE as reference database, in Materials Informatics, to be published by John Wiley & Sons.
 Villars P., Cenzual K., Gladyshevskii R., Iwata S. (2018), PAULING FILE - towards a holistic view, Chem. Met. Alloys 11, 43-76.