Developing methods for mapping forest cover from satellite imagery is important support for the national forest inventory of Ukraine. The objective of this research was to investigate the variable selection procedure in the context of creating a forest mask for the lowland plains of Ukraine using Landsat 8 OLI seasonal composited mosaics and the Random Forest (RF) algorithm. The study is based on a reference dataset of more than 4700 sampling points. Each sampling point was interpreted visually using freely available images of very high spatial resolution from the Google and Bing Maps services. All Landsat 8 OLI images were filtered for the study area and combined into four seasonal composited mosaics for the following periods: year, summer, autumn and April-October, as described by Roy et al. (2010). The accuracy of the classification models was assessed by means of the out-of-bag (OOB) error, and variable importance was analyzed using the %IncMSE measure provided by the randomForest package for R. Based on these measures, we concluded that the classification incorporating all four mosaics had the highest accuracy, followed by the classification of the April-October mosaic. The errors obtained when classifying the other mosaics were significantly higher. We also found that including latitude and longitude in the list of predictors tends to increase the accuracy of land cover classification for the study area. We used a forward stepwise selection algorithm to analyze classification accuracy with different numbers of predictors (from 2 to 53). Finally, we concluded that the most accurate classification is the one that incorporates the 36 most important variables, including longitude and latitude.
Landsat 8 OLI, seasonal composited mosaics, forest mask, Random Forest
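
The forward stepwise procedure mentioned in the abstract can be illustrated with a minimal sketch. It assumes that predictors are added one at a time in order of decreasing importance and that the subset with the lowest OOB error is retained; the abstract does not spell out these details, so this is an illustration and not the authors' implementation. The function name, the `oob_error` callback, the predictor names and the toy error curve are all hypothetical; in practice the callback would refit a Random Forest on the given predictors and return its OOB error.

```python
from __future__ import annotations

from typing import Callable, Sequence


def forward_selection_by_importance(
    ranked: Sequence[str],
    oob_error: Callable[[Sequence[str]], float],
    min_vars: int = 2,
) -> tuple[list[str], float, list[tuple[int, float]]]:
    """Grow the predictor set in importance order and keep the best size.

    ranked    -- predictor names sorted from most to least important
    oob_error -- hypothetical callback returning the OOB error of a model
                 fitted on the given predictors
    Returns the selected predictors, their error, and the (size, error) curve.
    """
    if len(ranked) < min_vars:
        raise ValueError("need at least min_vars ranked predictors")
    curve: list[tuple[int, float]] = []
    best_k, best_err = min_vars, float("inf")
    for k in range(min_vars, len(ranked) + 1):
        err = oob_error(ranked[:k])
        curve.append((k, err))
        if err < best_err:  # strict comparison: ties keep the smaller subset
            best_k, best_err = k, err
    return list(ranked[:best_k]), best_err, curve


def toy_error(subset: Sequence[str]) -> float:
    # Stand-in for a real OOB error: decreases up to 4 predictors, then rises.
    return 0.05 + 0.01 * abs(len(subset) - 4)


# Hypothetical predictor names, already ranked by importance.
ranked = ["lon", "lat", "b5_summer", "b4_autumn", "b6_year", "b7_year"]
selected, err, curve = forward_selection_by_importance(ranked, toy_error)
```

Passing the error estimate as a callback keeps the selection loop independent of the modelling library, so the same loop could wrap any classifier that reports an out-of-bag or cross-validated error.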