Flood Mapping in Vegetated Areas Using an Unsupervised Clustering Approach on Sentinel-1 and -2 Imagery

: The European Space Agency’s Sentinel-1 constellation provides timely and freely available dual-polarized C-band Synthetic Aperture Radar (SAR) imagery. The launch of these and other SAR sensors has boosted the field of SAR-based flood mapping. However, flood mapping in vegetated areas remains a topic under investigation, as backscatter is the result of a complex mixture of backscattering mechanisms and strongly depends on the wave and vegetation characteristics. In this paper, we present an unsupervised object-based clustering framework capable of mapping ﬂooding in the presence and absence of ﬂooded vegetation based on freely and globally available data only. Based on a SAR image pair, the region of interest is segmented into objects, which are converted to a SAR-optical feature space and clustered using K-means. These clusters are then classiﬁed based on automatically determined thresholds, and the resulting classiﬁcation is reﬁned by means of several region growing post-processing steps. The ﬁnal outcome discriminates between dry land, permanent water, open ﬂooding, and ﬂooded vegetation. Forested areas, which might hide ﬂooding, are indicated as well. The framework is presented based on four case studies, of which two contain ﬂooded vegetation. For the optimal parameter combination, three-class F1 scores between 0.76 and 0.91 are obtained depending on the case, and the pixel- and object-based thresholding benchmarks are outperformed. Furthermore, this framework allows an easy integration of additional data sources when these become available.


Introduction
Floods have been, and continue to be, the most occurring of all natural disasters, causing substantial human and economic losses [1,2]. Moreover, their frequency, intensity, and impacts are expected to further increase due to climate change [3]. Insights into the occurrence and dynamics of floods are thus of paramount importance, as they contribute to emergency relief, damage assessment, and flood forecast improvement [4][5][6]. Spaceborne satellites have evolved into the preferred source of flood observations due to their synoptic view and near real-time availability. In contrast to optical sensors, Synthetic Aperture Radar (SAR) sensors allow for observations during both day and night as well as under cloudy conditions. Furthermore, water surfaces are generally clearly distinguishable from the surrounding land due to their smooth character. Indeed, SAR sensors send out microwaves to the Earth's surface and measure the returned signal or backscatter, which depends on the roughness, structure and dielectric properties of the surface as well as on the properties of the incoming wave [7]. when considering the HV/HH polarization ratio, the separability between these classes increased to excellent. Furthermore, Tsyganskaya et al. [19] successfully made use of a polarization ratio to identify FV.
A broad overview of FV classification approaches, within the context of both flood and wetland monitoring, is provided by Tsyganskaya et al. [29]. The majority of these is supervised, which limits their transferability and applicability for near real-time flood monitoring. Furthermore, most of them consider single scene backscatter intensity only. While some have demonstrated the use of interferometric coherence, both for X- [43,44] and L-band [45], the application of this information source remains uncommon. Moreover, the use of polarimetric parameters remains rather limited, despite promising results. For example, Brisco et al. [46] demonstrated the inclusion of polarimetric decomposition bands in a curvelet-based change detection framework. Plank et al. [47] combined polarimetric parameters of both quad-polarized L-band ALOS-2/PALSAR-2 and dual-polarized Sentinel-1 C-band imagery in a Wishart classification framework. This framework is unsupervised but requires the manual labeling of meaningful classes. Mainly the limited availability of quad-polarized SAR imagery seems to hamper the application of polarimetry based approaches. Based on Sentinel-1 intensity data only, Tsyganskaya et al. [19] presented a supervised time series-based approach. Depending on the case, the normalized VV intensity or VV/VH ratio were found to be the most important features [19,48]. In a recent proof-of-concept, Refice et al. [40] have successfully combined multi-temporal C-and L-band imagery to monitor flooding in a remote vegetated area, by using a clustering-based approach. Olthof and Tolszczuk-Leclerc [49] developed a supervised machine learning approach based on dual-pol RADARSAT-2 imagery, using automatically extracted training data based on a Landsat inundation frequency product to map open water and using region growing to include flooded vegetation. Based on X-band data, Pierdicca et al. [50] presented an object-based region-growing approach capable of detecting flooded narrow-leaf crops, while Grimaldi et al. [51] applied a single scene based approach making use of probability binning and historic flood information. However, despite these and other considerable contributions, a transferable, unsupervised framework for flood mapping in vegetated areas based on freely available data only is still lacking.
In this study, we present an unsupervised, object-based clustering framework for flood mapping. This framework makes use of freely and globally available data only, is capable of easily fusing different data sources and does not require training data. Based on dual-polarized SAR and optical data, a classification into four classes, i.e., dry land, permanent water, open flood, and flooded vegetation, is made. Moreover, forested areas possibly hiding flooding are indicated. In the remainder of this paper, the full processing chain is described in detail. Finally, results for four case studies are described and discussed.

Study Cases
In order to illustrate the accuracy and robustness of the presented methodology, results are presented based on four study cases with varying characteristics, both in terms of flood type and vegetation cover: the Sava River in Croatia, the White Volta River in Ghana, and River Fergus and Shannon in Ireland. Of these, only the former two contain flooded vegetation areas. The location of these areas is shown in Figure 1, while descriptions are given below.
The Lonsjko Polje Nature Reserve is a large wetland area located along the Sava River in Croatia. The area is used as a retention basin to protect the surroundings from flooding in case of high discharge. Therefore, it experiences regular and long-lasting floods. These occur especially in late spring, due to snowmelt, but also in autumn and winter, due to intensive rain. In 2019, the area was flooded between early May and late June. Reference data were constructed based on a cloud-free Sentinel-2 image capturing the flood on 7 June 2019, a 0.5 m resolution aerial imagery covering the 2016 winter floods, and hydraulic knowledge (water level time series and levee location). Forest is the dominating vegetation type, next to some wet grasslands and agricultural fields along the river and surrounding the rural settlements. A considerable fraction of the flooding is situated in the lowland forests. Given the canopy density during summer, this part of the flooding is expected to be invisible on C-band SAR imagery. The White Volta is part of the main river system in Ghana. It emerges in Burkina Faso and discharges into Lake Volta. In August 2018, communities in Northern and Upper East regions of Ghana were affected by heavy and continuous seasonal rainfall. Additionally, excess water from the Bagre Dam, located in Burkina Faso, was spilled from the 31st of August until the 10th of September. The combination of these two events caused unprecedented flooding in many local communities along the White Volta and continued throughout September [52]. Along the river banks, mainly shrubs and herbaceous vegetation occur. Several patches of open forest can be found too. Furthermore, the region of interest (ROI) also comprises several settlements and a considerable amount of agriculture. Reference data were constructed based on a Sentinel-2 image acquired on September 19. As this image is not fully cloud-free, six cloud-free subsets were selected. These are also indicated on Figure 1. Several patches of flooded vegetation were identified.
In the winter of 2015-2016, the passage of Storm Desmond led to exceptionally high amounts of rainfall in the UK and Ireland. In Ireland, this led to groundwater flooding that lasted for several months [53]. The Copernicus Emergency Management Service (EMS) was activated and the flooding was mapped for several areas [54]. Two areas were considered in this study, i.e., the surroundings of Ennis and Corofin along Rivers Fergus, and the surroundings of Ballinasloe and Portumna along River Shannon. The former was delineated by the EMS based on a COSMO-SkyMed image acquired on 16 December 2015, while the latter was delineated based on a COSMO-SkyMed image of 9 January 2016. These mappings were used as means of reference. The ROI along River Fergus comprises the city of Ennis, agriculture zones and several lakes. The vegetation is mainly herbaceous but several forest patches can be found too. The ROI along River Shannon is less forested, although several smaller patches occur along the river. Moreover, here, mainly herbaceous vegetation occurs alongside agriculture and several peat bogs. Based on the reference data, no flooded vegetation is expected.

Data
The classification framework makes use of globally available, open source data sets only. These include Sentinel-1 imagery, Sentinel-2 imagery or the Copernicus Global Land Service (CGLS) land cover product [55], and the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM). An overview of the acquisition dates or time ranges of the data sources is provided in Table 1. For each study case, a Sentinel-1 image pair, i.e., an image acquired before and one during the flood, was selected. Sentinel-1 provides C-band SAR imagery in two polarizations, i.e., VV and VH, with a spatial resolution of 10 m and a repeat frequency of six to twelve days depending on the location. In this study, the Level-1 Ground Range Detected product of Interferometric Wide swath data was used. In order to avoid distortions due to differences in viewing angle, images of the same relative orbit were selected. This selection was done manually in the present study, but could be automated too [56]. For each image, the precise orbit file was applied, thermal and border noise were removed, and a radiometric calibration to sigma0 and terrain correction were applied. Next, the images were co-registered and speckle filtered using the Lee Sigma filter (7 × 7 window). These preprocessing steps were performed with ESA's SNAP software.
In order to include information on vegetation cover, two data sources are compared. First, Sentinel-2 imagery can provide insights on the vegetation state around the time of flooding. However, clouds often hamper optical observations, especially in rainy periods. If no cloud-free image can be found within the 3 month time range before the start of the flood, a cloud-free composite (CFC) over this time range is calculated. This CFC is obtained based on the approach of Simonetti et al. [57], by calculating the median of the image stack after cloud masking based on a classification tree. Within this study, the Level-2A product, i.e., 10 m resolution Bottom Of Atmosphere (BOA) reflectance, is used. Second, land cover (LC) products can provide a more general view on the vegetation state. They are often derived from longer image time series and the data quality thus does not depend on the presence of clouds. However, these products often have a coarser resolution and, in case of strong seasonality, the current state of the vegetation cannot be deduced. Due to its global availability, the CGLS land cover product was selected [55]. It has a resolution of 100 m and provides a discrete land cover map comprising 23 classes as well as class fractions for the ten base classes: bare ground and sparse vegetation, moss and lichens, herbaceous vegetation, shrubland, cropland, forest, built-up, snow and ice, permanent and seasonal water).
Finally, a DEM over each ROI is considered. The hydrologically conditioned elevation, derived from the SRTM and included in the HYDROSHEDS product [58] is used in this study.

Methods
The unsupervised clustering framework is summarized in Figure 2 and comprises several steps, i.e., segmentation of the image into objects, extraction of the object feature space (FS), K-means clustering and classification, and post-processing refinement. In this section, each of these steps is described in detail.

Image Segmentation Using the Quickshift Algorithm
First, image pixels are grouped into object preliminaries using the quickshift algorithm. Quickshift is a further elaboration of mean shift, a density estimation based algorithm that finds clusters by assigning data points to nearby density modes [59]. It is based on the assumption that the feature space represents an empirical probability density function of the represented parameter, and clusters (or segments) are thus represented by dense regions. Whereas mean shift makes use of gradient descent for mode seeking, quickshift uses a kernel-based approximation. First, local density is estimated using a Gaussian kernel, with a size defined by the parameter ks. Given a kernel K, the density of a pixel p is calculated as follows, where I is the image array; r and c are the row and column indices, respectively; and px is the center pixel. Both spatial and image color distance are thus considered for the density calculation. Next, each pixel is assigned to its nearest neighbor with a higher density. The parameter dist max controls the maximal size of the resulting objects. All pixels that are more then dist max away from the nearest pixel with a higher density, are assigned as a cluster (or segment) seed.
The quickshift algorithm is applied on a feature space comprising four bands, i.e., the VV and VH bands of the SAR image pair. The two algorithm parameters ks and dist max were determined empirically. In order to capture small scale features, these parameters are set to resp. a 7 × 7 window and a value of 4.
Next, the resulting object preliminaries are refined iteratively. For each object, the object mean difference with each of its neighbors is calculated for the four input bands. If for all four bands, this difference is below a precalculated threshold, the objects are merged. The threshold is determined as the minimum of the standard deviations of the considered object and neighbor for that specific band: Furthermore, in order to prevent the objects from becoming too irregularly shaped, a shape constraint is added. Objects are merged only if the perimeter over rooted area is below 12. This threshold was set on a trial-and-error basis. The object refinement constraints can thus be summarized as follows.

Object-Based Clustering
In order to detect the intrinsic structure of the feature space, object-based clustering is applied. In a first phase, several clustering algorithms, including K-means, spectral clustering, and quickshift clustering, were tested. However, given its computational speed, robustness, and good results, the K-means algorithm was withheld.
K-means is an iterative clustering algorithm that aims to minimize the within-cluster sum-of-squares [60]. For a predefined number of clusters k, the cluster centroids are first randomly set. Each sample is then assigned to the nearest centroid. Lastly, updated cluster centroids are obtained by taking the mean of the samples in that cluster. The latter two steps are repeated until a (local) minimum of the within-cluster sum-of-squares is found. In order to circumvent the issue of local optima, the procedure is repeated for different centroid initializations. K-means can result in a sub-optimal cluster partitioning when the clusters are of differing size or density, or have non-globular shapes. In unbalanced datasets, K-means tends to split up the larger clusters. This issue can be overcome by over-clustering the dataset and combining the subclusters of larger clusters in a later phase. In order to determine the optimal value of k and illustrate the effect of a varying k, all values between 2 and 15 were tested and compared.
In order to include information on both the flooding and the vegetation state, K-means is run on a feature space consisting of both SAR and optical or land cover data. Several feature spaces are tested. With respect to the SAR features, the VV and VH band of the reference and flood image can be complemented with the ratio features, where ratio stands for the ratio in linear scale or difference in log scale (dB) between the VV and VH polarization, and/or the increase features, i.e., the difference in VV/VH/ratio between the reference and flood image. The vegetation state can be represented by the discrete land cover class, the nine land cover fractions or a selection of three or ten optical bands. The combination of these features results in 15 feature spaces, i.e., SAR, SARlc, SARlcfrac, SARo3, SARopt, SARincF, SARincFlc, SARincFlcfrac, SARincFo3, SARincFopt, SARwC, SARwClc, SARwClcfrac, SARwCo3, and SARwCopt. These feature spaces consist out of different SAR and optical subspaces, which are summarized in Table 2. For example, the feature space SARwCopt consists of the subspaces SAR, wC and opt, or the following bands: VV ref , VH ref , VV flood , VH flood , R ref , R flood , VV inc , VH inc , R inc , B 2 , B 3 , B 4 , B 5 , B 6 , B 7 , B 8 , B 8A , B 11 and B 12 . The feature space is obtained by calculating object means for continuous bands and modes for discrete bands. Also the object standard deviations are calculated for all bands. Each feature is scaled to zero mean and unit variance.

Cluster Classification
The final classification aims to discriminate dry land (DL), permanent water (PW), open flooding (OF), and flooded vegetation (FV) if present. Given that K-means is not capable to correctly separate clusters of differing sizes, over-clustering is targeted and k will be set to a value above 2. This implies that the resulting clusters still need to be classified into one of the above-mentioned classes. Clusters are classified based on their centroid. For the PW and OF classes, the VV and VH bands are considered and the thresholds are obtained by means of tiled pixel-based thresholding using the KI algorithm, as suggested by [12]. The PW and OF class are defined as follows.
As mentioned before, flooding beneath vegetation is expected to increase backscatter intensity due to enhanced double-bounce backscattering. As VV is more sensitive to double-bounce, this increase is expected to be more pronounced in the VV band as compared to the VH band. Thus, the FV class is characterized by an increase in both VV inc and R inc . The thresholds are both set to a value of 3, based on a calibration as well as reported values in literature [19,36,40]. The FV class is defined as

Post-Processing Refinement
The initial classification is further refined using contextual information. Several studies have underlined the added value of classifying pixels or objects not only based on their spectral properties, but also on their surroundings [16,47,61]. Context is integrated here by means of a region growing (RG) approach, which iteratively adds neighboring objects that satisfy a preset condition to the seed region. First, the PW, OF, and FV classes are refined. For PW and FV, the growing condition is the same as the classification rule (cf. Section 3.3). For OF, the object mean should be below the threshold in only one of the polarizations instead of the two. This way, objects that exhibit a slightly increased backscatter in one of both polarizations due to e.g., wind roughening are included too. Table 3 summarizes the growing conditions as well as the seed and source classes for PW, OF and FV. For FV for example, only DL objects adjacent to OF or FV objects are considered.
Next, low elevation areas adjacent to the flood are considered. Densely vegetated areas can be falsely classified as DL due to limited penetration, but can be included based on hydrologic considerations. Objects of which more than 50% of the neighbors are flooded and whose elevation is lower than the minimum elevation of the flooded neighbors, are included in the OF class using a RG approach. Then, a minimal mapping unit (MMU) of 10 pixels (1000 m 2 ) is applied. This parameter value was set empirically and is in line with reported values [47,62]. Finally, densely forested areas might hide flooding due to limited penetration. Therefore, objects whose land cover is closed forest are flagged as forested areas (FA). Table 3. RG parameters for the PW, OF, and FV classes.

Accuracy Assessment
The accuracy of the presented approach is assessed primarily by the F1 score, the harmonic mean of Precision and Recall. Similarly as the Critical Success Index (CSI), this measure does not consider the correctly classified dry land pixels (true negatives in a binary classification) which cause an overestimation of the accuracy [63]. The F1 score is preferred here given its higher popularity for classifications problems in general. It is calculated based on the number of true positives (TP), false positives (FP) and false negatives (FN), which can be retrieved from the contingency matrix.
In order to get an idea of over-and underestimations, the Precision and Recall are also calculated separately. The respective equations are Recall = TP TP + FN As the resulting classification comprises four classes, i.e., DL, PW, OF, and FV, the accuracy metrics cannot be calculated in a binary way. Therefore, a multi-class accuracy is determined as the unweighted average of the accuracies obtained for each class separately. This way, the different classes have an equal influence on the resulting accuracy and a bias due to a dominating class (often DL) is avoided.
Moreover, as FV is not present in all ROIs, OF and FV are considered as one class. Thus, a 3-class F1 score is obtained based on the DL, PW, and OF + FV classes. In order to allow comparison with other studies, who typically result in a single scene or change flood map, the single (PW + OF + FV vs. DL) and change (OF + FV vs. DL + PW) accuracies are calculated too. The results are furthermore compared to a benchmark, i.e., the classifications resulting from object-and pixel-based thresholding. The thresholds were obtained by automated tile-based thresholding using the KI algorithm [12], while the PW and OF class definitions were the same as in Equation (4).

Separability of Flooded Vegetation
In order to know whether the flooded vegetation present in the ROI can be mapped at all, it is important to get an idea of the separability between this class and others. Previous studies indicated the separability between FV and OF is generally good, while significant confusion can occur between the FV and DL classes [41,42]. Figure 3 shows the class distributions of FV and DL across the SAR features for the Sava ROI and subset 3 of the Volta ROI. In the Sava ROI, deciduous forest is the predominant type of FV. As the flooding occurred in summer, limited penetration is expected. This can also be seen in Figure 3, as there is a strong overlap and similarity between the two classes for all SAR features. The flooded vegetation in Volta-3 is a mixture of shallow water, shrubs, and grassland. The resulting backscatter shows a more pronounced shift from the DL class for several SAR features. At the time of flooding, VH shows a shift towards lower backscatter values, while VV is shifted towards both higher and lower backscatter values. This observation confirms the assumption that VV is more sensitive to double-bounce, and FV thus leads to increased backscatter values. VH is less sensitive to both double-bounce and roughening of specular surfaces, resulting in a drop due to more specular reflection rather than an increase due to double-bounce. The VV inc and VH inc features show similar shifts, while the R inc feature shows a clear increase, explained by the differences in backscatter response between VV and VH. However, a considerable overlap between the DL and FV classes remains in all features, so classifying FV by means of single feature thresholding is not possible.

K-Means Cluster Classification
K-means clustering and cluster classification (CC) were applied on several feature spaces for a range of k values. The resulting three-class F1 scores as well as the F1 scores for the pixel-and object-based thresholding benchmark are visualized in the left column of Figure 4. For all ROIs, several FS/k combinations outperform the benchmarks. However, significant differences between the FS/k combinations exist and several trends can be identified. First of all, k values below 5 lead to poor results for all cases. As K-means has difficulties identifying clusters of varying sizes and the DL class is significantly larger than the others, over-clustering is necessary. Moreover, the optimal k value increases with an increasing number of features. The structure of the feature space indeed becomes more complex when the number of dimensions increases, and more clusters are needed to capture the full structure. Despite their differences in land cover, no significant differences concerning the optimal k value could be detected across the ROIs. Considering the SAR features only, the benchmarks are outperformed for several values of k for all ROIs. As the same thresholds were used to classify the clusters, objects, and pixels, this improved accuracy underlines the added value of first grouping similar objects in clusters based on the inherent structure of the data. By doing so, objects that do not satisfy the PW or OF class definition but are similar to a group of objects that does, are nevertheless classified into this class. Therefore, issues like water roughening are circumvented. However, the accuracy for the SAR FS across k values is rather inconsistent. Other FSs lead to higher accuracies, although the optimal FS seems to depend on the ROI and the outcome is quite sensitive to the choice of k. The inclusion of the land cover fractions leads to poor results across all ROIs, due to the skewness of these features, which typically contain a lot of zero values. This feature subspace was thus found unsuited for K-means clustering without transformation, but the results were included here for completeness. In general, the SARincF and SARwC FSs complemented by three or ten optical bands perform well and the optimal k lies around 10. This k value is in line with the findings of Refice et al. [40], who initially opted for a higher number of clusters but obtained 12 distinct clusters after merging.
In order to include information on texture too, some tests were done considering both the object means and standard deviations (results not shown). However, the inclusion of standard deviation features did not improve the resulting accuracies, on the contrary. Due to the image segmentation and object refinement as well as the speckle filtering, standard deviation and texture in general was found significantly less descriptive than the mean.
It is important to note that across all ROIs, FSs, and k values, only one cluster was classified as FV, i.e., for SARwC and k = 15 on the Volta ROI. The FV areas in both the Sava and Volta ROI were primarily classified as DL. In the former, this confusion can be attributed to the limited penetration of the C-band SAR signal. In the latter, several objects but no clusters-except for one-satisfy the FV class definition. The detected cluster (SARwC-15) has a mean VV inc of 3.1 and a mean R inc of 4.1. It exists mainly out of FV objects but contains 17% DL objects too. For lower values of T VVinc and T Rinc , more FV clusters would be detected but all of these would lead to a considerable amount of FPs due to confusion with the DL class. The threshold values of 3 were chosen based on the considered cases as well as values reported in literature. However, these could need some refinement based on insights brought by additional cases.

Post-Processing Refinement
The post-processing procedure comprises several steps, i.e., the refinement of the OF, PW, and FV classes using region growing, the inclusion of low elevation areas using DEM-based region growing, the removal of small flood objects, and the indication of forested areas. The three-class F1 scores obtained after applying the RG and MMU steps on all FS/k combinations are shown in the right column of Figure 4. As can be seen when comparing the two columns in this figure, the post-processing procedure improves both the accuracy and the robustness of the methodology. Especially the sensitivity to the choice of k is decreased but also the differences amongst FSs are reduced. Across all ROIs, the best results are obtained by SARwCopt with 10 clusters. However, considering the SAR FS, a three-class F1 of only 0.016 less (0.8368 compared to 0.8532) is obtained. The added value of the optical features is thus rather limited, although improvements can be obtained depending on the case. The improvement compared to the cluster classification is most pronounced for the Sava and Volta ROIs, while differences are marginal for the Fergus and Shannon ROIs. However, the initial accuracy for the latter is already high, so only marginal improvements can be obtained. Moreover, these ROIs do not contain flooded vegetation areas. Figure 5 shows the evolution of the three-class F1 score throughout the different processing steps, including forest flagging, for SAR-10 and SARwCopt-10. As can be seen here, the impact of these steps strongly depends on the ROI as well as the FS/k combination. In general, mainly RG PW and RG OF alter the classification outcome. RG FV significantly altered the classification outcome for Volta too, although the resulting change in accuracy is limited due to the small size of this class. Forest flagging also significantly increases the accuracy for all ROIs. The shown values are the accuracies calculated over the non-forested areas only. The increase is most pronounced for the Sava ROI due to the high abundance of forests in this region. The high accuracy mostly indicates that the visible flooding is mapped highly accurately and underlines the incapability of Sentinel-1 to map flooding under dense forest canopies. The limited impact of RG DEM is not unexpected, given the coarse resolution (both horizontally and vertically) of the SRTM DEM product. Especially in forested areas, the reliability of this product can be poor [64]. A finer resolution DEM could substantially increase the added value of this step.
For each PP step, several parameter values were compared. For example, the outcome of RG OF into regions below the threshold in VV and VH vs. in VV or VH was compared. The latter lead to minimal differences for Fergus, Shannon, and Sava but increased the accuracy for Volta with up to 0.10. RG DEM only has a negligible impact on the outcome. Relaxing the growing condition to the mean or maximum of neighboring heights increased the accuracy for the Volta ROI but significantly lowered the accuracy for other cases. On the other hand, the outcome proved insensitive to the fraction of flooded neighbors required. For RG FV too, several values of T VVinc and T Rinc were tested. Again, a value of 3 resulted in the best trade-off of FPs and FNs, but some refinement based on additional cases might be beneficial. Furthermore, the chosen MMU value provided the best trade-off between FPs and FNs across all ROIs.

Final Flood Maps
Contingency maps for the final classifications are shown in Figure 6. In general, most of the water surfaces are mapped correctly. In the Sava ROI, a high number of false negatives (FN) occurs due to dense forest canopies, hampering the SAR signal penetration. As can be seen, almost the entire floodplain is forested. More FNs occur in the Volta subsets, especially at the flood edges. In the latter, quite some within-flood confusion (WFC) occurs too. These are areas that were mapped as PW, OF or FV but belonged to another of those three according to the reference data. These WFCs are mainly confusions between FV and OF, which can be explained by the fact that the flooded vegetation in this ROI is a mixture of shallow water and flooded vegetation and thus the backscatter signature is a mix too. For both the FNs in the Sava ROI and the WFCs in the Volta ROI, the inclusion of additional SAR features like L-band imagery could be beneficial. In order to allow comparison with other studies, Table 4 summarizes the F1 scores when considering the flood classification as a three-class, single scene or change detection problem. The single and change F1 scores exceed the three-class value for all ROIs except for the Sava area, where the DL accuracy is significantly higher than the PW and OF values. Although these values are encouraging, they should be interpreted with care as they were calculated on relatively small reference subsets. This prevents the accuracy from being too heavily biased by the DL class, but does not allow to thoroughly test the algorithm for FP detection in the context of automated flood monitoring. Within the Volta ROI, one subset was selected in a flood-free area and did not result in erroneous flood detection. However, ideally, the approach would also be applied on a non-flood image pair in order to test whether seasonal backscatter changes are not erroneously picked up as flooding.

Limitations and Future Improvements
Although the results presented in this paper illustrate the potential of the developed framework, several aspects could be improved in the future.
First of all, the considered feature space could be further extended and improved. As some confusion occurred between the OF and FV classes in the Volta ROI and the flooded forests in the Sava ROI could not be detected, the inclusion of additional SAR features could further refine the flood class definitions and help overcome these issues. Olthof and Rainville [65] and Plank et al. [47] illustrated the potential of including multiple polarizations and derived polarimetric parameters, as well as SAR imagery acquired using different wavelengths. Especially L-band imagery could significantly contribute to the framework's capability of mapping flooding in forested areas [31,33,40]. Within the context of open source data, the upcoming NISAR mission, which is expected to be launched in 2022 and will provide L-and S-band SAR imagery, is promising [66]. Moreover, ESA is planning to include an L-band satellite, named ROSE-L, in its Copernicus program. It is part of the six high-priority candidate missions currently being studied [67]. As the Sentinel constellation provides systematic imagery with short revisit times, the inclusion of time series information could improve the classification outcome too [19]. As such, the sensitivity to the choice of the reference image could be decreased, though anomaly detection is less suited for persistent flooding. The inclusion of time series could furthermore contribute to the detection of thematic clusters as long as the weight of the flood features is maintained.
Second, the robustness of the classification thresholds could be further increased. Currently, the FV classification thresholds were set as fixed values. For the considered cases, these values provided a good trade-off between over-and underestimation. However, more cases are needed to check the general applicability of these values. Besides good reference data, a thorough investigation of the impact of incidence angle and vegetation type on the resulting backscatter for different wavelengths and polarizations, based on an experimental set-up, would provide invaluable information for the future improvement of FV classification algorithms. Furthermore, future work could investigate whether the selection of the FV threshold values could be automated, similar as is the case for the OF and PW thresholds. Moreover, the incidence angle dependency of both classification thresholds is a topic for further investigation.
Third, some modifications can be made to further upscale the presented framework. Given the global availability of the input data, the framework's independence on training data and the good results for floods with varying characteristics, it has substantial potential for automated near-real time flood monitoring. However, the presented results were obtained on relatively small (around 1000 km 2 for Fergus/Shannon/Sava, 16,000 km 2 for Volta), manually delineated subsets with a significant fraction of flooding. On full scenes, the fraction of flooding is usually significantly lower and the detection of a flood cluster can be hampered, similar as with global thresholding. Moreover, the computation time, which was 39 minutes for the Fergus ROI on an Intel E5-2660v3 (Haswell-EP @ 2.6 GHz) computation node using 4 cores and 16 GB of RAM, is expected to significantly increase with increasing image sizes. However, both issues could be overcome by detecting the core area of flooding prior to applying the clustering framework. This could be done by applying a split-based approach, as was suggested by Chini et al. [68]. Last, confusion between urban and flooded vegetation areas did not occur in the considered ROIs. However, given the similar backscatter behavior of these thematic classes, confusion could occur in other regions. This could be prevented by masking out urban areas, similar as was done for forested areas, e.g., based on Global Urban Footprint [69].

Conclusions
In this study, an unsupervised clustering-based approach that can be used for automated, near real-time flood mapping in vegetated areas based on freely available data is presented. After image segmentation, K-means clustering is applied on an object-based feature space consisting of SAR and optical features. The resulting clusters are classified based on their centroids. Finally, the classification is refined by a region growing post-processing refinement. The final outcome discriminates between dry land, permanent water, open flooding and flooded vegetation, while forested areas which might hide flooding are indicated too. Results are presented based on four case studies, of which two contain areas of flooded vegetation. The results for the ROIs without flooded vegetation illustrate the added value of the clustering approach, as the pixel-and object-based benchmarks are outperformed by the clustering framework even when considering the same SAR features. Although good results are obtained based on the SAR bands only, additional SAR and optical features lead to further improvements. Moreover, the post-processing refinement significantly increases the accuracy of the methodology and decreases the sensitivity to the parameter choice. Across all ROIs, the best result was obtained based on the SARwCopt FS using 10 clusters. For the Sava, Volta, Fergus, and Shannon ROIs, three-class F1 scores of 0.7648, 0.8588, 0.8793, and 0.9098, respectively are obtained. The detection of flooding beneath dense forest canopies was not possible using C-band SAR and optical features only. Flooding in less densely vegetated areas could be mapped successfully, although significant confusion with the open flood class occurred. However, the clustering framework allows to easily integrate additional features. For example, the inclusion of L-band SAR imagery could substantially improve the FV classification accuracy. The code of this object-based flood mapping framework is available through https://github.com/h-cel/OBIAflood. Author Contributions: L.L. developed the methodology, carried out the analysis and interpretation, and wrote the paper. N.E.C.V. and F.M.B.V.C. contributed to the critical review of the methodology as well as the content, structure and language of the paper. All authors have read and agreed to the published version of the manuscript.

Acknowledgments:
The authors would like to thank ESA's Copernicus program for providing the Sentinel-1 and -2 imagery, the CGLS land cover product and the EMS mapping products. We would furthermore like to thank the World Wide Fund for Nature for providing the HydroSHEDS product.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript.