In order to compare the RFP algorithm with the KNN and Rules learning algorithms, we used the abalone, auto-mpg, buying, country, cpu, electric, flare, housing, read and servo real-world datasets for function approximation (available at http://funapp.cs.bilkent.edu.tr [11]). Information about the number of instances, the number and type of features, and the presence of missing values is

and IB3 got 96.7% accuracy (5 errors). Both IB3 and the neural networks have parameters to set, while tubular neighbors makes an automatic choice based on cross-validation.
7.3 Servo data
This data set is from the Irvine repository. The response is the rise time for a servo mechanism. There are two integer-valued predictors taking 4 and 5 consecutive levels and two categorical predictors each
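
The idea of letting cross-validation pick a neighborhood parameter automatically, mentioned in the excerpt above, can be illustrated with a minimal sketch. It uses plain k-nearest-neighbor regression on a single numeric feature with leave-one-out error. This is not the tubular neighbors method itself, and the function names and toy data are assumptions made for illustration only.

```python
def knn_predict(train_x, train_y, query, k):
    """Average the responses of the k training points nearest to query."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def loo_error(xs, ys, k):
    """Leave-one-out mean squared error of k-NN regression for a given k."""
    total = 0.0
    for i in range(len(xs)):
        # Hold out point i and predict it from the remaining points.
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (knn_predict(rest_x, rest_y, xs[i], k) - ys[i]) ** 2
    return total / len(xs)


def choose_k(xs, ys, candidates):
    """Return the candidate k with the lowest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(xs, ys, k))


# Toy data (hypothetical): a noiseless linear response.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
best_k = choose_k(xs, ys, [1, 2, 3])
```

No parameter has to be set by hand here beyond the list of candidate values; the data decide among them.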

considered.
Dataset               Housing  Cpu  Prices  Mpg  Servo  Ozone
Number of examples        506  209     159  392    167    330
Number of regressors       13    6      16    7      8      8
where ω_i are weights that can be conveniently used to discount each error according
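
The role of per-error weights can be sketched as follows. The excerpt does not show the criterion the weights enter, so this example assumes a weighted sum of squared errors; the function name and the numbers are hypothetical.

```python
def weighted_squared_error(y_true, y_pred, weights):
    """Sum of squared errors, each scaled by its own weight.

    Assumption: a squared-error criterion. The excerpt only states that
    the weights discount each error, not the exact form of the loss.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must have the same length")
    return sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))


# The third error is the largest but carries the smallest weight.
total = weighted_squared_error([1.0, 2.0, 3.0], [1.0, 1.0, 5.0], [1.0, 0.5, 0.25])
print(total)
```

A weight below 1 reduces the contribution of its error to the total, which is the discounting the excerpt refers to.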

a sequence of 100 networks was trained using different values of γ for each hidden node. For the auto-mpg, servo and Tecator data sets (3 hidden nodes) the γ values (0.5, 1.5, 2.5) were used, for the glass data set (6 hidden nodes), the values (0.5, 1.0, 1.5, 2.0, 2.5, 3.0) were used, and for the bodyfat data set (7 hidden nodes)

Multilayer perceptrons behave similarly, as shown in figure 4 and confirmed by experiments performed with the Solar, Wine, Glass and Servo data sets. The most important difference from high order perceptrons is that the networks either fail to converge or converge only very slowly for weight variances close to zero. Such variances should therefore not be used

itself has a large influence on the optimal initial weight variance: for the solar, wine, and servo data sets, the networks have about the same size for the same order, but the optimal value for the weight variance differs a lot for the network with the logistic activation function. Further, the optimal
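
To make the notion of an initial weight variance concrete, here is a minimal sketch that draws initial weights with a chosen variance. It assumes a zero-mean Gaussian distribution, which the excerpt does not specify; the function name, seed, and variance grid are illustrative assumptions.

```python
import random
import statistics


def init_weights(n_weights, variance, seed=0):
    """Draw initial weights from a zero-mean Gaussian with the given variance.

    Assumption: Gaussian initialization. The excerpt speaks only of an
    initial weight variance, not of the distribution used.
    """
    rng = random.Random(seed)
    std = variance ** 0.5  # gauss() takes a standard deviation, not a variance
    return [rng.gauss(0.0, std) for _ in range(n_weights)]


# Hypothetical grid of variances spanning several orders of magnitude.
for variance in (1e-6, 1e-4, 1e-2, 1.0):
    weights = init_weights(20000, variance)
    print(variance, statistics.pvariance(weights))
```

Sweeping such a grid and recording the resulting training behavior is one way to look for the optimal variance for a given network and data set.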

GASEN's generalization error is significantly lower than that of the simple ensemble method, and e-GASEN attains still lower generalization errors than GASEN. On the Servo data set, GASEN is slightly inferior to the simple ensemble. The performance of the e-GASEN method, however, does not differ significantly from that of the simple ensemble method. From the aforementioned statistics
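
The comparison above contrasts averaging all component networks with averaging only some of them. The sketch below assumes that the simple ensemble averages the predictions of every component model and that a selective ensemble averages a chosen subset. The subset is given by hand here: the search that GASEN uses to choose it is not implemented, and all names and numbers are hypothetical.

```python
def simple_ensemble(predictions):
    """Average the per-instance predictions of all component models.

    predictions: one list of predicted values per model, all the same length.
    """
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


def selective_ensemble(predictions, selected):
    """Average only the models whose indices appear in selected.

    Assumption: the subset is supplied by the caller; no selection
    procedure is run here.
    """
    return simple_ensemble([predictions[i] for i in selected])


# Three hypothetical models predicting two instances each.
preds = [[1.0, 2.0], [3.0, 4.0], [8.0, 9.0]]
print(simple_ensemble(preds))             # averages all three models
print(selective_ensemble(preds, [0, 1]))  # averages the first two only
```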

are AutoMpg, AutoPrice, Housing, MachineCpu and Servo. The other three data sets are from dynamic domains where QUIN has typically been applied so far [Suc, 2003; Suc and Bratko, 2002]. It should be noted that in these domains the primary objective was to explain the