A straightforward example of an 'ecological informatics' application is the Fish-based Decision Support System (FIDESS), a 'decision support system' (DSS) recently developed in Italy that is based on artificial intelligence and aims at assisting environmental management policies.
The need for such a DSS stemmed from the European Water Framework Directive (WFD), which set a very ambitious goal for all the member states, that is, improving the quality of all surface water bodies by 2015 up to a level that can be considered as 'good'. In order to enforce the WFD policies accordingly, appropriate evaluation methods are required. The WFD clearly states that the key criterion is the 'ecological status', that is, an expression of the quality of the structure and functioning of aquatic ecosystems associated with surface waters, which is mainly based on biotic 'quality elements'. Fish fauna plays a major role among the latter, not only because fish species are effective biological indicators of environmental quality in aquatic ecosystems, but also because of their iconic value.
The majority of the available assessment methods based on fish have been developed during the last two decades, and they are mostly inspired by the seminal work by J. Karr, who developed a multimetric index (the 'index of biotic integrity', IBI), which combines 12 attributes of the fish assemblage that are supposed to respond to environmental disturbance (i.e., metrics) into a single score. This approach is inherently flexible, and therefore it has been adapted to a number of countries and river basins, not only in North America, but also in Europe and other continents.
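The multimetric idea can be illustrated with a minimal sketch. It assumes the usual IBI-style pattern in which each metric is discretized into a score of 5, 3, or 1 and the scores are summed. The metric names and thresholds below are hypothetical placeholders, not Karr's actual values, and only three metrics are shown instead of twelve.

```python
from typing import Dict, Tuple

# Hypothetical thresholds: (low_cut, high_cut, higher_is_better).
# Real IBI variants calibrate these per basin; these values are placeholders.
THRESHOLDS: Dict[str, Tuple[float, float, bool]] = {
    "species_richness": (5.0, 10.0, True),
    "pct_tolerant_individuals": (20.0, 50.0, False),
    "pct_diseased_individuals": (1.0, 5.0, False),
}


def score_metric(value: float, low: float, high: float, higher_is_better: bool) -> int:
    """Discretize one metric into a 1/3/5 score."""
    if higher_is_better:
        if value >= high:
            return 5
        return 3 if value >= low else 1
    if value <= low:
        return 5
    return 3 if value <= high else 1


def multimetric_index(metrics: Dict[str, float]) -> int:
    """Sum the per-metric scores into a single index value."""
    return sum(score_metric(metrics[name], *THRESHOLDS[name]) for name in THRESHOLDS)


site = {
    "species_richness": 12.0,
    "pct_tolerant_individuals": 30.0,
    "pct_diseased_individuals": 0.5,
}
index_value = multimetric_index(site)  # 5 + 3 + 5
```

The discretization step is where information is lost: two sites falling on either side of a threshold receive different scores no matter how similar they are, and the thresholds themselves must be recalibrated whenever the index is moved to a different basin.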
Although multimetric biotic indices have become commonplace tools in environmental management, they are not optimized from a computational point of view and therefore even the most successful ones often fail, providing evaluations that are not consistent with other ecological evidence. This limited capability is not surprising, as no evaluation method can be simple, general, and accurate at the same time. Multimetric indices are certainly simple, so they have to give up generality in order to be accurate, and in fact the most successful ones are usually aimed at a single river basin or at a single, very homogeneous ecoregion. Basically, multimetric biotic indices usually rely upon a sound ecological rationale, but they exploit the available information in a suboptimal way.
In order to be both general and accurate, methods for evaluating the ecological status must be more complex than multimetric indices in the way they process the available information. 'Ecological informatics' is certainly the appropriate conceptual and methodological framework for developing such an optimized method.
Therefore, a DSS based on an ANN was trained to associate fuzzy expert judgments to environmental and fish assemblage data. This solution was based on the assumption that complex biotic relationships that link fish assemblage composition to environmental conditions can be embedded into an ANN and that such an ANN can be trained to mimic the way human experts issue their judgments.
In fact, expert judgment, although inherently subjective, is the key for any environmental assessment method, from the selection of relevant metrics to the discretization of the scoring scales of multimetric indices. The same subjectivity affects the evaluation of the ecological status, which cannot be univocally defined and is mostly based on the personal interpretation of natural phenomena. In spite of the lack of objective criteria, ecologists usually agree in ranking sites according to their ecological status, because they share a common rationale.
FIDESS is still under development, as more information (fish assemblage data, environmental data, and multiple expert judgments) is needed to fully train the ANN with respect to a full spectrum of ecoregional conditions, and at present it is optimized for central Italy.
In spite of theoretical problems related to the so-called 'curse of dimensionality', and thanks to the strong biotic relationships that implicitly constrained the learning phase, a few hundred records were enough to properly train a very complex ANN. (The curse of dimensionality refers to the exponential increase in volume caused by the addition of new dimensions to an n-dimensional space. In machine learning applications it usually hinders the solution of problems involving a limited number of patterns in a high-dimensional feature space.) This ANN is a 59-25-5 MLP, which has 27 abiotic and 32 biotic inputs. The former include several hydromorphological attributes as well as some chemicophysical ones, while the latter comprise the presence/absence of 30 species plus overall and juveniles-only species richness as descriptors of the fish assemblage. The ANN has five outputs, which correspond to fuzzy membership estimates relative to each one of the ecological status classes that are defined by the European WFD (and that are considered in the human expert judgments). The ANN outputs can be regarded as memberships because they sum to one, thanks to a softmax activation function in the output nodes.
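A forward pass through a network with this 59-25-5 layout can be sketched as follows. This is only an illustration under stated assumptions: the weights are random rather than trained, the tanh hidden activation and the presence of one bias per node are assumptions (the text specifies only the layer sizes and the softmax output), and all function names are hypothetical.

```python
import math
import random
from typing import List

N_INPUTS, N_HIDDEN, N_OUTPUTS = 59, 25, 5  # layer sizes given in the text


def init_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """Random weights, one row per node; the last column is the bias (assumed)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def affine(weights: List[List[float]], x: List[float]) -> List[float]:
    """Weighted sum plus bias for every node in a layer."""
    return [sum(w * v for w, v in zip(row[:-1], x)) + row[-1] for row in weights]


def softmax(z: List[float]) -> List[float]:
    """Numerically stable softmax: outputs are positive and sum to one."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(w_hidden: List[List[float]], w_out: List[List[float]], x: List[float]) -> List[float]:
    hidden = [math.tanh(a) for a in affine(w_hidden, x)]  # tanh is an assumption
    return softmax(affine(w_out, hidden))


rng = random.Random(0)
w_hidden = init_layer(N_INPUTS, N_HIDDEN, rng)
w_out = init_layer(N_HIDDEN, N_OUTPUTS, rng)
x = [rng.random() for _ in range(N_INPUTS)]  # stand-in for 27 abiotic + 32 biotic inputs
memberships = forward(w_hidden, w_out, x)

# Free parameters under the one-bias-per-node assumption: 60 * 25 + 26 * 5
n_params = sum(len(row) for row in w_hidden) + sum(len(row) for row in w_out)
```

Under these assumptions the network has 1630 free parameters, which gives a sense of why training it on a few hundred records would normally be regarded as problematic.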
The training of the ANN-based DSS is performed using not only data directly obtained from sampling, but also 'virtual' records. Basically, during expert judgment elicitation, human experts are also asked to point out which changes in biotic and/or abiotic variables would affect their evaluation, or to explain how their evaluation would change in case different (but likely) environmental and faunistic properties were observed. In this way alternate scenarios can be easily simulated and new expert judgments can be associated with each 'virtual' record, thus widening the knowledge base upon which FIDESS is built.
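The construction of a 'virtual' record can be sketched as a copy of an observed record with some inputs changed and a new expert judgment attached. The data structure, the field names, and the variable names below are hypothetical, and the sketch assumes that expert judgments are stored as five memberships summing to one, ordered from the best to the worst status class.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Record:
    inputs: Dict[str, float]      # abiotic and biotic descriptors
    judgment: Tuple[float, ...]   # fuzzy memberships to the five status classes
    virtual: bool = False         # True when the record was not actually sampled


def make_virtual(base: Record, changes: Dict[str, float],
                 new_judgment: Tuple[float, ...]) -> Record:
    """Derive a what-if record from an observed one, with a revised judgment."""
    if len(new_judgment) != 5 or abs(sum(new_judgment) - 1.0) > 1e-9:
        raise ValueError("judgment must hold five memberships summing to one")
    unknown = set(changes) - set(base.inputs)
    if unknown:
        raise KeyError(f"unknown input variables: {sorted(unknown)}")
    # Copy the inputs so the observed record is left untouched.
    return Record(inputs={**base.inputs, **changes}, judgment=new_judgment, virtual=True)


# Hypothetical observed record and the expert's answer to a what-if question.
observed = Record(
    inputs={"dissolved_oxygen_mg_l": 8.5, "species_a_present": 1.0},
    judgment=(0.2, 0.7, 0.1, 0.0, 0.0),
)
what_if = make_virtual(
    observed,
    changes={"species_a_present": 0.0},
    new_judgment=(0.0, 0.3, 0.6, 0.1, 0.0),
)
```

Keeping the `virtual` flag on each record makes it possible to separate sampled from elicited data later, for instance when assembling an independent test set.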
Even though at its present development stage FIDESS can be regarded as a very early alpha release of the final tool, it has been tested using an independent data set (n = 69). A confusion matrix, that is, a 5 × 5 contingency table, was obtained by cross-tabulating human expert judgment against FIDESS classification, showing a very good agreement: two out of three cases were correctly classified after defuzzification, while the worst-case error was within a single quality class. A typical measure of interobserver agreement, the weighted Kappa statistic, confirmed that the deviation of the FIDESS classification from a random agreement with expert judgment was highly significant.
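The evaluation steps can be sketched as follows. Two choices are assumptions, since the text does not specify them: defuzzification is done by picking the class with the largest membership, and the Kappa weights are linear in the distance between classes. The sketch computes only the statistic itself, not its significance test, and the example data are made up.

```python
from typing import List, Sequence

K = 5  # number of ecological status classes


def defuzzify(memberships: Sequence[float]) -> int:
    """Pick the class with the largest membership (assumed defuzzification rule)."""
    return max(range(len(memberships)), key=lambda i: memberships[i])


def confusion_matrix(expert: Sequence[int], model: Sequence[int], k: int = K) -> List[List[int]]:
    """Cross-tabulate expert classes (rows) against model classes (columns)."""
    table = [[0] * k for _ in range(k)]
    for e, m in zip(expert, model):
        table[e][m] += 1
    return table


def weighted_kappa(table: List[List[int]]) -> float:
    """Weighted Kappa with linear disagreement weights |i - j| / (k - 1)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0
    expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            observed += w * table[i][j]
            expected += w * row_tot[i] * col_tot[j] / n  # chance agreement
    return 1.0 - observed / expected


# Made-up example: six sites, two of them off by one class.
expert_classes = [0, 1, 1, 2, 3, 4]
model_classes = [0, 1, 2, 2, 3, 3]
table = confusion_matrix(expert_classes, model_classes)
kappa = weighted_kappa(table)
```

Weighting matters here because the classes are ordered: a misclassification by one quality class is penalized less than one by several classes, which matches the observation that the worst-case error stayed within a single class.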
Although computationally intensive, an ANN-based DSS cannot by itself be regarded as a paradigm for 'ecological informatics'. An essential component in this light is the 'Graphical User Interface' (GUI) that was wrapped around the ANN to provide user-friendly and interactive access to FIDESS (Figure 1). The GUI makes the ANN (that is, the complexity that users need not deal with) completely transparent to users, who are free to interact with FIDESS. As soon as they modify the input data, they can observe the changes in classification in real time. Although it is trivial if compared to the ecological and computational background of FIDESS, the GUI is not a secondary feature. On the contrary, it plays a major role in the acceptance of FIDESS. In fact, while most users are familiar with multimetric and other biotic indices, they do not feel comfortable with an ANN, which is perceived as a rather obscure 'black box'. Interacting with FIDESS in real time through a user-friendly GUI, for example by moving sliders, helps users to learn how FIDESS reacts to changes in biotic and abiotic variables and to understand that FIDESS just mimics their own way of reasoning. The relationships between user's input, ANN, and FIDESS outputs are summarized in Figure 2.
In conclusion, this combination of a typical artificial intelligence technique, a smart knowledge elicitation procedure, and a very user-friendly and interactive GUI can be regarded as a good example of what 'ecological informatics' is all about: combining available methods, data, knowledge, and software into new, viable solutions for ecological problems.