Data Analysis in Vegetation Ecology

Otto Wildi

WSL Swiss Federal Institute for Forest, Snow and Landscape Research, Birmensdorf, Switzerland


This edition first published 2010 © 2010 by John Wiley & Sons, Ltd

Wiley-Blackwell is an imprint of John Wiley & Sons, formed by the merger of Wiley's global Scientific, Technical and Medical business with Blackwell Publishing.

Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Other Editorial Offices:

9600 Garsington Road, Oxford, OX4 2DQ, UK

111 River Street, Hoboken, NJ 07030-5774, USA

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloguing-in-Publication Data

Wildi, Otto.

Data analysis in vegetation ecology / Otto Wildi. p. cm.

1. Plant communities-Data processing. 2. Plant communities-Mathematical models. 3. Plant ecology-Data processing. 4. Plant ecology-Mathematical models. I. Title. QK911.W523 2010 581.70285-dc22


ISBN: 9780470661017 (HB) 9780470661024 (PB)

A catalogue record for this book is available from the British Library.

Typeset in 10.5/13 Times by Laserwords Private Limited, Chennai, India

Printed in Great Britain by TJ International, Padstow, Cornwall.


Preface xi

List of Figures xv

List of Tables xxi

1 Introduction 1

2 Patterns in Vegetation Ecology 5

2.1 Pattern recognition 5

2.2 Interpretation of patterns 9

2.3 Sampling for pattern recognition 11

2.3.1 Getting a sample 11

2.3.2 Organizing the data 14

3 Transformation 17

3.1 Data types 17

3.2 Scalar transformation and the species enigma 19

3.3 Vector transformation 21

3.4 Example: Transformation of plant cover data 23

4 Multivariate Comparison 25

4.1 Resemblance in multivariate space 25

4.2 Geometric approach 27

4.3 Contingency testing 29

4.4 Product moments 30

4.5 The resemblance matrix 32

4.6 Assessing the quality of classifications 33

5 Ordination 35

5.1 Why ordination? 35

5.2 Principal component analysis (PCA) 37

5.3 Principal coordinates analysis (PCOA) 41

5.4 Correspondence analysis (CA) 43

5.5 The horseshoe or arch effect 47

5.5.1 Origin and remedies 47

5.5.2 Comparing DCA, FSPA and NMDS 49

5.6 Ranking by orthogonal components 51

5.6.1 Method 51

5.6.2 A numerical example 53

5.6.3 A sampling design based on RANK (example) 55

6 Classification 59

6.1 Group structures 59

6.2 Linkage clustering 62

6.3 Minimum-variance clustering 64

6.4 Average-linkage clustering: UPGMA, WPGMA, UPGMC and WPGMC 66

6.5 Forming groups 67

6.6 Structured synoptic tables 69

6.6.1 The aim of ordering tables 69

6.6.2 Steps involved 70

6.6.3 Example: Ordering Ellenberg's data 72

7 Joining Ecological Patterns 75

7.1 Pattern and ecological response 75

7.2 Analysis of variance 77

7.2.1 Variance testing 77

7.2.2 Variance ranking 79

7.2.3 How to weight cover abundance (example) 80

7.3 Correlating resemblance matrices 84

7.3.1 The Mantel test 84

7.3.2 Correlograms: Moran's I 86

7.3.3 Spatial dependence: Schlaenggli data revisited 89

7.4 Contingency tables 92

7.5 Constrained ordination 96

8 Static Explanatory Modelling 101

8.1 Predictive or explanatory? 101

8.2 The Bayes probability model 102

8.2.1 The discrete model 104

8.2.2 The continuous model 105

8.3 Predicting wetland vegetation (example) 106

9 Assessing Vegetation Change in Time 111

9.1 Coping with time 111

9.2 Rate of change and trend 112

9.3 Markov models 115

9.4 Space-for-time substitution 122

9.4.1 Principle and method 122

9.4.2 The Swiss National Park succession (example) 125

9.5 Dynamics in pollen diagrams (example) 127

10 Dynamic Modelling 133

10.1 Simulating time processes 135

10.2 Including space processes 141

10.3 Processes in the Swiss National Park (SNP) 142

10.3.1 The temporal model 142

10.3.2 The spatial model 145

10.3.3 Simulation results 146

11 Large Data Sets: Wetland Patterns 151

11.1 Large data sets differ 151

11.2 Phytosociology revisited 153

11.3 Suppressing outliers 156

11.4 Replacing species with new attributes 158

11.5 Large synoptic tables? 162

12 Swiss Forests: A Case Study 169

12.1 Aim of the study 169

12.2 Structure of the data set 170

12.3 Methods 172

12.4 Selected questions 175

12.4.1 Is the similarity pattern discrete or continuous? 175

12.4.2 Is there a scale effect from plot size? 176

12.4.3 Does the vegetation pattern reflect the environmental conditions? 177

12.4.4 Is tree species distribution man-made? 178

12.4.5 Is the tree species pattern expected to change? 184

12.5 Conclusions 184

Appendix A On Using Software 189

A.1 Spreadsheets 189

A.2 Databases 190

A.3 Software for multivariate analysis 191

Appendix B Data Sets Used 193

References 195

Index 205

Preface

Plants are so unlike people that it's very difficult for us to appreciate fully their complexity and sophistication.

Michael Pollan, The Botany of Desire


When I started to rearrange my lecture notes, I had a 'short introduction to multivariate vegetation analysis' in mind. It ended up as a 'not so short introduction'. The book now summarizes some of the well-known methods used in vegetation ecology. The material presented is but a small selection of what is available to date. By focusing on methodological issues I try to explain what plant ecologists do, and why they measure and analyse data. The models and methods I discuss do more than generate numbers and pretty graphs: they contribute to the understanding of the state and functioning of the ecosystems analysed. Because researchers are usually driven by their curiosity about the functioning of these systems, I gradually began to integrate examples encountered in my own work. These now occupy a considerable portion of this book. I am convinced that the fascination of research lies in the perception of the real world and its amalgamation in the form of high-quality data with hidden content processed by a variety of methods reflecting our model view of the world. Neither my results nor my conclusions are final. Hoping that readers will like some of my ideas and perspectives, I encourage them to use and to improve on them. There remains considerable scope for innovation.

The examples presented in this book all come from Central Europe. While this was not intended originally, I became convinced that the topics they cover are of general relevance, as similar investigations exist almost everywhere in the world. An example is the pollen data set: pollen profiles offer the unique chance to study vegetation change over millennia. This is the time scale of processes such as climate change and the expansion of the human population. Another time series, much shorter than that of the pollen data, is found in the permanent plot data from the Swiss National Park that I had the opportunity to look at. Its unique feature is that it dates back to the year 1917, when Josias Braun-Blanquet personally installed the first wooden poles, which are still in place. Records of the full set of species have been collected ever since in five-year steps. A totally different data set comes from the Swiss Forest Inventory, presented in the last chapter of this book. Whereas many vegetation surveys are merely preferential collections of plot data, this data set is an example of systematic sampling on a grid encompassing huge environmental gradients. It helps to assess which patterns really exist, and whether some of those described in papers or textbooks are real or merely reflect the imagination or preference of researchers scanning the landscape for nice locations. In this case the data set available for answering the question is still moderate in size, but the handling of large data sets will eventually be needed in similar contexts. I used the Swiss wetland data set as an example of handling much larger data, in this case with n = 17608 releves. Although other data sets are larger, this one rests on a statistical sampling design.

Some basic knowledge of vegetation ecology might be needed to understand the examples presented in this book. Readers wishing to acquire this are advised to refer, for example, to the comprehensive volumes Vegetation Ecology by Eddy van der Maarel (2005) and Aims and Methods of Vegetation Ecology by Mueller-Dombois and Ellenberg (1974), presently available as a reprint. The structure of my book is influenced by Orloci's (1978) Multivariate Analysis in Vegetation Research, which I first explored when proofreading it in 1977. Various applications are found in the books of Gauch (1982), Pielou (1984) and Digby and Kempton (1987), and many multivariate methods used in vegetation ecology are introduced in Jongman et al. (1995). To study the statistical methods used in this book in more detail, I strongly recommend the second edition of Numerical Ecology by Legendre and Legendre, probably the most comprehensive textbook existing today. Several books provide an introduction to the use of statistical packages; these are referred to in the appendix. For many reasons I decided to omit the software issue from the main text; at the request of several reviewers I added a section to the appendix where I reveal how I calculated my examples and mention programs, program packages and databases.

I would like to express my thanks to all individuals who have contributed to the success of this book. First of all, I thank Rachel Wade from Wiley-Blackwell, who strongly supported the efforts to print the manuscript in time and organized all the technical work. I thank Tim West for careful copy-editing, and Robert Hambrook for managing the production process. My colleagues Anita C. Risch and Martin Schütz revised the entire text, providing corrections and suggestions. Meinrad Küchler helped in the computation of several examples. Andre F. Lotter provided the pollen data set. I cannot remember all the people who had an influence on the point of view presented here: many ideas came from Laszlo Orloci through our long-lasting collaboration, others from Madhur Anand, Enrico Feoli, Valerio de Patta Pillar, Janos Podani and Helene Wagner. I particularly thank my family for encouraging me to tackle this work and for their tolerance when I was working at night and on weekends to get it completed.

Otto Wildi

Birmensdorf, 1 December 2009

List of figures

2.1 Portrait of Abraham Lincoln. 6

2.2 Vegetation mapping as a method for establishing a pattern. 6

2.3 Ordination of a typical horseshoe-shaped vegetation gradient. 7

2.4 A natural and a man-made event. 9

2.5 Primary production of the vegetation of Europe. 10

2.6 Distribution pattern of oak haplotypes in Switzerland. 11

2.7 The elements of sampling design. 13

2.8 Organization scheme of sample data. 15

3.1 An example of three data types. 18

3.2 Scalar transformation of population size. 20

3.3 Scalar transformation of the coordinates of a graph. 20

3.4 Overlap of two species with Gaussian response. 21

4.1 Presentation of data in the Euclidean space. 26

4.2 Three ways of measuring distance. 28

4.3 The correlation of vector j with vector k. 32

4.4 The average distance of a distance matrix is a perfect measure for homogeneity. 32

4.5 Similarities within and between the forest types of Switzerland. 34

5.1 Three-dimensional representation of similarity relationships. 36

5.2 Main functions of PCA. 37

5.3 Projecting data into ordination space in PCA. 38

5.4 Numerical example of PCA. 39

5.5 Main results of a PCA using real data. 40

5.6 Comparison of PCA and PCOA. 43

5.7 Comparison of CA and PCA. 46

5.8 Performance of two species along a gradient. 47

5.9 The principle of detrending and FSPA. 48

5.10 Comparison of CA and DCA. 50

5.11 Comparison of PCA and NMDS. 50

5.12 Comparison of PCOA and FSPA. 51

5.13 Releves chosen by RANK for permanent investigation. 57

6.1 Two-dimensional group structures. 60

6.2 A dendrogram from agglomerative hierarchical clustering. 62

6.3 Comparing single- (SL), average- (AL) and complete-linkage (CL) clustering. 63

6.4 Variance within and between groups. 64

6.5 Cutting dendrograms at different levels of dissimilarity. 68

6.6 Structuring the meadow data set of Ellenberg. 72

7.1 Distinctness of group structure. 78

7.2 F-values of selected site factors. 83

7.3 Spatial arrangement of measurements (pH) and correlogram. 88

7.4 Projecting distances in one direction. 90

7.5 Evaluating the direction of the floristic gradient. 90

7.6 Correlograms of site factors. 92

7.7 Ordination of group structure in the test data set 'nzzm5'. 97

7.8 Comparison of RDA and CCA. 100

8.1 Occurrence probability of three hypothetical vegetation types. 102

8.2 Schematic illustration of a Bayes probability model. 103

8.3 Site suitability computed with the Bayes model. 108

8.4 Occurrence probability of three selected species. 109

8.5 Similarity of field releves and simulated data. 109

9.1 Type of environmental study needed to assess change. 112

9.2 Measuring rate of change in time series of multistate systems. 113

9.3 Ordination of data from eight plots in the Swiss National Park. 114

9.4 Rate of change in 2 plots, Swiss National Park. 115

9.5 A Markov model of the Lippe et al. (1985) data set. 117

9.6 PCOA ordination of the Lippe succession data. 120

9.7 Markov model of the time series of the Swiss National Park. 121

9.8 The principle of space-for-time substitution. 122

9.9 The similarity of time series. 124

9.10 Pinus mugo on a former pasture in the Swiss National Park. 125

9.11 Minimum spanning tree (data from the Swiss National Park). 126

9.12 Ordering of 59 time series from the Swiss National Park. 126

9.13 Succession in pastures of the Swiss National Park. 127

9.14 Tree species in the pollen diagram from Soppensee (Lotter 1999). 128

9.15 Velocity profile of the Soppensee pollen diagram. 129

9.16 Time trajectory of the Soppensee pollen diagram. 129

9.17 Velocity profiles from quantitative towards qualitative. 130

9.18 Time acceleration trajectory of the Soppensee pollen diagram. 131

10.1 Attempt to get a dynamic model under control (Wildi 1976). 134

10.2 Numerical integration of the exponential growth equation. 136

10.3 Logistic growth of two populations, model 1. 138

10.4 Logistic growth of two populations, model 2. 139

10.5 Logistic growth of two populations, model 4. 140

10.6 The mechanism of spatial exchange. 141

10.7 Overgrowth of a plot by a new guild. 143

10.8 Spatial design of SNP model. 145

10.9 Original (left) and simulated (right) temporal succession. 146

10.10 Results of the spatial simulation of succession, Alp Stabelchod. 148

11.1 Six alliances represented in a random sample of the Swiss wetland vegetation data. 155

11.2 Six alliances represented in a releve sample best fitting the phytosociological classification. 156

11.3 Frequency distribution of nearest-neighbour pairs of releves. 157

11.4 Ordination of a random sample of the Swiss mire vegetation data (left) and the same with outliers removed (right). 158

11.5 Projecting a given sample into a new resemblance space. 159

11.6 Ordination of the wetland sample in the indicator space. 160

11.7 Indicator values superimposed on ordinary ordination. 161


11.8 Similarity matrices of 10 alliances. 163

12.1 Two ordinations of the Swiss forest data set. 174

12.2 Vegetation map (eight groups). 175

12.3 The effect of different plot sizes on similarity pattern. 177

12.4 Vegetation probability map (eight groups). 179

12.5 Distribution of four selected tree species. 181

12.6 Ordination of forest stands. Four selected tree species marked. 182

12.7 Ecograms of forest stands. Four selected tree species marked. 183

12.8 Tree and herb layers of three species in ecological space. 186

List of tables

2.1 Terms used in sampling design. 12

3.1 Effects of different vector transformations. 22

3.2 Numerical example of vector transformation (two vectors). 22

3.3 Transformation of cover-abundance values in phytosociology. 24

4.1 Notations in contingency tables. 29

4.2 Resemblance measures using the notations in Table 4.1. 30

4.3 Product moments. 31

5.1 Notation used in correspondence analysis with examples. 45

5.2 Data set for illustrating the RANK algorithm. 53

5.3 Ranking releves of the 'Schlaenggli' data set. 56

5.4 Ranking species of the 'Schlaenggli' data set. 57

6.1 Properties of four popular clustering methods. 67

6.2 Steps involved in sorting synoptic tables. 70

7.1 The structured data set 'nzzm5'. 80

7.2 Variance ranking of species. 81

7.3 Transformations used in the variance-testing example. 82

7.4 Autocorrelation in a one-dimensional gradient. 87

7.5 Computed correlogram. 88

7.6 Mantel test of the site factors analysed in Section 7.2.3. 91

7.7 The structured data set 'nzzm5'. 94

8.1 Notation used in Bayes modelling. 104

8.2 Mean and standard deviation within groups of pH and water level. 107

9.1 Markov process, simulated data. 118

9.2 Numerical example demonstrating a Markov process. 119

10.1 Approximating integration by numerical integration. 136

10.2 Comparison of models 2 and 3. 140

10.3 Initial values in the time-series data. 147

10.4 Six discrete vegetation states (Figure 10.8) used as initial conditions. 148

11.1 Synoptic table of a random sample of the Swiss mire vegetation data. 164

11.2 Synoptic table of a selective sample of the Swiss mire vegetation data. 166

12.1 Composition of eight vegetation types. 173

12.2 F-values of site factors based on eight forest vegetation types. 180

12.3 Number of plots where selected tree species occur. 185
