Site-Specific Soil Sampling


by T.G. Mueller

BAE 599


Introduction

To vary management across the landscape, you need to know how soil properties vary across agricultural fields. Site-specific soil sampling is often used for variable rate fertilizer P and K application and liming. It is not economically feasible to use site-specific soil sampling for within-field variable rate N application because of the ephemeral nature of N in the soil. Throughout the class we will discuss alternative methods for managing N spatially. There are many ways to sample soil site-specifically. We will describe a few.

Zone sampling involves defining management zones within fields. A sample is collected from each zone and sent to the laboratory for analysis. The number and geometry of the sub-samples that are composited for each sample vary. Information used to identify sampling zones includes remotely sensed imagery, yield maps, soil survey maps, topographic information, historical management zones, ground-based sensors, and arbitrary regular blocking (similar to grids except that entire blocks are sampled, not grid points). Sometimes this information is used to delineate zones by eye. Other times, analytical techniques are used to delineate zones. The hope is that fertility will be very similar within zones and very different between zones.

Smart sampling involves using spatial information (the same kinds of information used for zone sampling) to pick areas within fields to sample site-specifically. For example, a yield map could be used to identify poorly producing areas within a field. Smart sampling would involve collecting samples from these areas.

Grid sampling involves defining a regular or irregular grid and collecting samples at each grid point. Sub-samples should be collected around each grid point and composited into one sample that will be sent to the laboratory for analysis. There are no real standards for the size of the grid, the number of sub-samples, or the radius of the area around each grid point. Once the field is sampled, maps are produced from the laboratory information. Unlike the other methods described, grid sampling requires spatial interpolation to make management maps. Because points are sampled, not areas, the majority of the field is often not sampled. Interpolation procedures are used to predict soil properties between sampled points. The most common ones used in precision agriculture are inverse distance weighting, nearest neighbor, and kriging.
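As an illustration of one of these interpolators, the following Python sketch predicts a soil property at unsampled locations with inverse distance weighting. The function name idw_predict, the default power of 2, and the coordinates and pH values are assumptions made for this example; they are not data from the class exercise.

```python
import numpy as np

def idw_predict(xy_known, z_known, xy_new, power=2.0):
    """Inverse distance weighted prediction at each row of xy_new."""
    xy_known = np.asarray(xy_known, dtype=float)
    z_known = np.asarray(z_known, dtype=float)
    xy_new = np.atleast_2d(np.asarray(xy_new, dtype=float))
    # Distance from every new location (rows) to every sampled point (columns).
    d = np.linalg.norm(xy_new[:, None, :] - xy_known[None, :, :], axis=2)
    preds = np.empty(len(xy_new))
    for i, row in enumerate(d):
        exact = row == 0.0
        if exact.any():
            # A location that coincides with a sample returns the sampled value.
            preds[i] = z_known[exact].mean()
        else:
            # Weights fall off with distance raised to the chosen power.
            w = 1.0 / row ** power
            preds[i] = np.sum(w * z_known) / np.sum(w)
    return preds

# Hypothetical grid samples: coordinates in feet and soil pH.
xy = [(0, 0), (200, 0), (0, 200), (200, 200)]
ph = [5.8, 6.2, 6.0, 6.6]
print(idw_predict(xy, ph, [(100, 100), (50, 50)]))
```

A larger power gives nearby samples more influence, which is why the choice of power can change the quality of the resulting map.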

Multiple stage sampling involves first a coarse reconnaissance sampling (e.g., grid or zone sampling). Low fertility areas can then be sampled in more detail. This approach is not very common.

Regardless of the approach used, practitioners expect that all mapping procedures will yield accurate fertility maps for management. This basic premise may or may not be correct. In the past, many studies that evaluated site-specific fertility management systems focused on side-by-side yield comparisons of variable rate and single rate treatments. The results have been mixed. Methods exist for testing the basic premise of whether fertility maps created from site-specific sampling accurately represent soil properties across agricultural fields. Map quality studies have been conducted to assess maps created with various zone (Mueller et al., 2001) and grid sampling (Mueller et al., 2000; Mueller et al., 2001) strategies. Many of these approaches use quantitative measures of map error such as mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and correlation or regression coefficients. Some of the formulas for these measures are given in Mueller et al., 2001. While these procedures can be very useful, nothing replaces a good graphical examination of plots of predicted versus measured values. In your exercises, we will be considering the RMSE and MAE.
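The error measures named above can be computed in a few lines. The sketch below uses the standard definitions: MSE is the mean of the squared differences between predicted and measured values, RMSE is its square root, and MAE is the mean of the absolute differences. The function name map_errors and the example values are made up for illustration.

```python
import numpy as np

def map_errors(measured, predicted):
    """Return MSE, RMSE, and MAE for paired measured and predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - measured
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
    }

# Hypothetical measured and predicted soil pH at four validation points.
measured = [6.0, 6.4, 5.8, 6.2]
predicted = [6.1, 6.2, 6.0, 6.2]
print(map_errors(measured, predicted))
```

Because RMSE squares the errors before averaging, a few large misses raise it more than they raise MAE, so the two measures can rank maps differently.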

There are different ways to evaluate these map errors, including cross validation and jackknife analysis. In Mueller et al., 2001, we refer to jackknife analysis as cross validation with an independent data set. When you conduct jackknife analysis, a second (validation) data set is collected. The idea is that you use your prediction data set to predict the values at the validation points. If the predictions for the validation points are very similar to the measured values, that indicates you have a good map. Collecting an additional data set can be expensive, so cross validation was developed as an inexpensive alternative. The idea with cross validation is that you use the prediction data set to validate itself. You remove a point from the prediction data set. Then you interpolate a value at that point from the other data in the prediction data set. Then you add that point back into the prediction data set. You repeat this for all of the points. Finally, as with jackknife analysis, you compare the cross validation predictions with the measured values. For both approaches, the MSE, RMSE, MAE, and graphical plots of predicted versus measured values are used to assess the quality of the predictions. Some of the work I have done has found serious limitations with cross validation. These are discussed in Mueller et al., 2002. We will conduct cross validation and jackknife analyses in your exercise and you will compare the differences.
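The two procedures described above can be sketched as follows. To keep the example short, a nearest neighbor predictor stands in for the interpolator; any function with the same arguments could be passed instead. The function names, coordinates, and pH values are hypothetical and are not taken from the exercise data.

```python
import numpy as np

def nearest_neighbor(xy_known, z_known, xy_new):
    """Predict each new location with the value of the closest sampled point."""
    d = np.linalg.norm(xy_new[:, None, :] - xy_known[None, :, :], axis=2)
    return z_known[np.argmin(d, axis=1)]

def rmse(measured, predicted):
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))

def cross_validate(xy, z, interpolate):
    """Leave one point out, predict it from the rest, and repeat for all points."""
    preds = np.empty(len(z))
    for i in range(len(z)):
        keep = np.arange(len(z)) != i  # temporarily remove point i
        preds[i] = interpolate(xy[keep], z[keep], xy[i:i + 1])[0]
    return preds

# Hypothetical prediction data set (200 foot grid) and independent validation set.
xy_pred = np.array([[0, 0], [200, 0], [0, 200], [200, 200]], dtype=float)
z_pred = np.array([5.8, 6.2, 6.0, 6.6])
xy_val = np.array([[50, 20], [180, 190]], dtype=float)
z_val = np.array([5.9, 6.5])

# Cross validation: the prediction data set validates itself.
cv_preds = cross_validate(xy_pred, z_pred, nearest_neighbor)
# Jackknife (validation): the prediction data set predicts the independent points.
val_preds = nearest_neighbor(xy_pred, z_pred, xy_val)

print("cross validation RMSE:", rmse(z_pred, cv_preds))
print("validation RMSE:", rmse(z_val, val_preds))
```

Note that in cross validation on a regular grid, the nearest remaining sample is always at least one full grid increment away, whereas validation points can fall much closer to a sample.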

In this exercise, we will be considering three interpolation procedures: inverse distance weighting, kriging, and radial basis functions. There are pros and cons to each of these interpolation methods. The quoted material below is credited to Sharolyn Anderson. The link can be found here (http://www.cobblestoneconcepts.com/ucgis2summer/anderson/anderson.htm).

"SPLINE

The SPLINE method can be thought of as fitting a rubber-sheeted surface through the known points using a mathematical function. In ArcGIS, the spline interpolation is a Radial Basis Function (RBF). These functions allow analysts to decide between smooth curves or tight straight edges between measured points. Advantages of splining functions are that they can generate sufficiently accurate surfaces from only a few sampled points and they retain small features. A disadvantage is that they may have different minimum and maximum values than the data set and the functions are sensitive to outliers due to the inclusion of the original data values at the sample points. This is true for all exact interpolators, which are commonly used in GIS, but can present more serious problems for SPLINE since it operates best for gently varying surfaces, i.e. those having a low variance.

Inverse Distance Weighting (IDW)

Inverse Distance Weighting (IDW) is based on the assumption that the nearby values contribute more to the interpolated values than distant observations. In other words, for this method the influence of a known data point is inversely related to the distance from the unknown location that is being estimated. The advantage of IDW is that it is intuitive and efficient. This interpolation works best with evenly distributed points. Similar to the SPLINE functions, IDW is sensitive to outliers. Furthermore, unevenly distributed data clusters results in introduced errors.

KRIGING

Similar to IDW, KRIGING uses a weighting, which assigns more influence to the nearest data points in the interpolation of values for unknown locations. KRIGING, however, is not deterministic but extends the proximity weighting approach of IDW to include random components where exact point location is not known by the function. KRIGING depends on spatial and statistical relationships to calculate the surface. The two-step process of KRIGING begins with semivariance estimations and then performs the interpolation. Some advantages of this method are the incorporation of variable interdependence and the available error surface output. A disadvantage is that it requires substantially more computing and modeling time, and KRIGING requires more input from the user."
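The first of the two kriging steps mentioned in the quoted passage, semivariance estimation, can be sketched as follows. This uses the classical estimator, in which the semivariance for a lag class is half the mean squared difference between all pairs of points whose separation distance falls in that class. The function name empirical_semivariogram, the lag width, and the sample values are assumptions for illustration. Fitting a semivariogram model and solving the kriging system are not shown.

```python
import numpy as np

def empirical_semivariogram(xy, z, lag_width, n_lags):
    """Return mean lag distance, semivariance, and pair count per lag class."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)  # every pair of points, counted once
    dist = np.linalg.norm(xy[i] - xy[j], axis=1)
    sqdiff = (z[i] - z[j]) ** 2
    lags, gammas, counts = [], [], []
    for k in range(n_lags):
        in_bin = (dist > k * lag_width) & (dist <= (k + 1) * lag_width)
        if in_bin.any():  # skip lag classes that contain no pairs
            lags.append(dist[in_bin].mean())
            gammas.append(0.5 * sqdiff[in_bin].mean())
            counts.append(int(in_bin.sum()))
    return np.array(lags), np.array(gammas), np.array(counts)

# Hypothetical grid samples: coordinates in feet and soil pH.
xy = [(0, 0), (200, 0), (0, 200), (200, 200)]
ph = [5.8, 6.2, 6.0, 6.6]
lags, gammas, counts = empirical_semivariogram(xy, ph, lag_width=250, n_lags=2)
for h, g, n in zip(lags, gammas, counts):
    print(f"lag {h:.1f} ft  semivariance {g:.3f}  pairs {n}")
```

With real data, many more points are needed so that each lag class contains enough pairs to give a stable estimate.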


Objective

The objective of this exercise is to assess map quality for maps created with grid sampling and interpolation. The field we will be considering was grid sampled in 1999.


Assignment

To complete this assignment, you will need to follow the exercise instructions.

1. Do you think you are better able to assess map quality with graphs of predicted versus measured values or with quantitative measures of map error? Why?

2. Why does cross validation tend to overpredict errors of prediction with regular grids? Refer to Mueller et al. (2001).

3. What inverse distance weighted (IDW) power gave the lowest cross validation RMSE? What IDW power gave the lowest validation RMSE? Are they the same? If you chose your IDW power based on cross validation (pressing the Optimize Power value button), are you guaranteed to get the optimal prediction?

4. Conduct IDW cross validation and validation for soil P. Fill in the table below for soil P (200 foot grid). How does the prediction quality of soil P compare with that of pH? How do the optimal RMSE values compare? Did prediction quality vary substantially between pH and P, and if so, why might this occur?

5. How do map quality and the values for BpH and K compare with each other and with pH and P? Overall, how good were the maps in Field 16?

6. Now do the same thing for the XYField16_300 files and fill in the tables below. How has the scale of measurement (e.g., 200 foot grid vs. 300 foot grid) affected map quality? Also graph these relationships. What kind of advice would you give to a farmer about grid sampling intensity based on the 200 and 300 foot data?

7. How did scale affect prediction quality? Most (but not all) farmers who grid sample do so at a grid increment greater than 300 feet. What grid increment would you recommend to a farmer?

8. Was there a reduction in map error (RMSE) when modeling anisotropy? Were the plots of predicted versus measured values very different? How does kriging compare with the radial basis function and with inverse distance weighting? What kind of advice would you give to a farmer about interpolation methods?

9. Make a printout of a P interpolation to hand in with your homework. I will tell you during class which interpolation technique you should use.


References

Mueller, T.G., F.J. Pierce, O. Schabenberger, and D.D. Warncke. 2001. Map quality for site-specific management. Soil Sci. Soc. Am. J. 65: 1547-1558.

Mueller, T.G., K.L. Wells, G.W. Thomas, R.I. Barnhisel, N.J. Hartsock, A. Kumar, C.R. Dillon, and S.A. Shearer. 2000. Soil fertility map quality: Case studies in Kentucky. In P.C. Robert et al. (ed.) Proc. 5th International Conference on Precision Agriculture. ASA Misc. Publ., ASA, CSSA, and SSSA, Madison, WI.