Spatial Data Mining - GIS and Mapping - Exam, Exams of Geology

These exam notes cover the following topics: Spatial Data Mining, Visualizing Map Comparison, Summary Table, Comparing Discrete Maps, Comparing Continuous Surface Maps, Identifying Unusual Areas, Calculating a Similarity Map, Calculating a Cluster Map, Deriving a Dependent Variable Map.

Typology: Exams

2012/2013

Uploaded on 07/23/2013


Exercise #9 — Spatial Data Mining
GIS Modeling

Team Members _____<enter names>_____

Date _____<enter date>_____

Part 1 – Visualizing Map Comparisons (the following questions use the Agdata.rgs database)

Complete the following processing and include your responses immediately below each question.

Access MapCalc using the Agdata.rgs database. Generate side-by-side 2D Lattice displays of the 1997_Fall_P (phosphorus) and 1997_Fall_K (potassium) maps as shown. Be sure the two maps use the SAME LEGEND (Hint: use the “User Defined Ranges” and “Save Template” options in the Shading Manager).

Question 1. Screen-grab the composite graphic and embed below.

<insert screen grab>

In general terms, describe any similarities and differences you “see” (visual interpretation) in the spatial patterns of the P and K maps.

<insert discussion>

Screen-grab and embed the Shading Manager summary table for both maps with the Histogram tab selected. Discuss the similarities and differences between the two maps reported in their summary tables.

<insert screen grabs and discussion>

Part 2 – Comparing Discrete Maps

Generate a “Coincidence_summary” map between the categorized 1997_Yield_volume and 1998_Yield_volume data. Start by using the Statistics tab in the Shading Manager table to identify the average (Mean) and standard deviation (St. Dev.) for both maps.

First create the two categorized maps (1997_Yield_classes and 1998_Yield_classes) for the two periods by renumbering the base maps to the binary progression of map values indicated below:

1997_Yield_Low = 1 = more than 1 StDev below the mean
1997_Yield_Medium = 2 = between -1 StDev and +1 StDev of the mean
1997_Yield_High = 4 = more than 1 StDev above the mean
1998_Yield_Low = 8 = more than 1 StDev below the mean
1998_Yield_Medium = 16 = between -1 StDev and +1 StDev of the mean
1998_Yield_High = 32 = more than 1 StDev above the mean

…then use “Compute plus” to combine the two maps into a Coincidence_summary map. Be sure to display the maps as a Discrete data type with appropriate colors and labels for each of the categories.
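The renumbering and addition steps above can be sketched in Python as follows. This is only an illustration, not MapCalc's implementation: the yield values are hypothetical, the grid is flattened to a list, and the population standard deviation is assumed (MapCalc may report a sample standard deviation instead).

```python
from statistics import mean, pstdev

def classify(values, low, medium, high):
    """Renumber values into three classes using mean +/- 1 standard deviation."""
    m, s = mean(values), pstdev(values)  # assumption: population StDev
    classes = []
    for v in values:
        if v < m - s:
            classes.append(low)      # more than 1 StDev below the mean
        elif v > m + s:
            classes.append(high)     # more than 1 StDev above the mean
        else:
            classes.append(medium)   # within 1 StDev of the mean
    return classes

# Hypothetical yield values for six grid cells (illustration only).
yield_1997 = [10, 50, 55, 60, 65, 120]
yield_1998 = [12, 48, 130, 58, 70, 62]

classes_1997 = classify(yield_1997, 1, 2, 4)
classes_1998 = classify(yield_1998, 8, 16, 32)

# Adding the two class maps gives one unique sum per combination,
# e.g. 9 = 1 + 8 = Low in both years, 34 = 2 + 32 = Medium then High.
coincidence = [a + b for a, b in zip(classes_1997, classes_1998)]
print(coincidence)
```

Because each class code is a distinct power of two, every sum identifies exactly one 1997/1998 class pairing, which is why the binary progression is used instead of 1, 2, 3.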

Question 2. Embed screen-grabs of the two categorized maps and the Coincidence_summary map…

Complete the Coincidence Summary Table below using the summary cell counts in the Shading Manager table of the Coincidence_summary map you generated…

Coincidence Summary Table

Columns: 1997_Yield_classes Map. Rows: 1998_Yield_classes Map.

            Low     Medium  High    Total   Percent
Low
Medium
High
Total       --
Percent     --

Note: For the main portion of the table, Total = sum of the counts in that column (or row) and Percent Same = (#Same / Total) * 100. For the overall portion of the table (boxed area), Overall Total = sum of the Total column (or row) and Overall Percent Same = ((Sum of #Same) / Overall Total) * 100.
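As a rough sketch of the arithmetic in the note, the Python below computes row totals, column totals and percent-same figures from a 3 x 3 table of cell counts. The counts are hypothetical and the variable names are made up for illustration; substitute the counts from your own Shading Manager table.

```python
# Hypothetical cell counts: rows = 1998 class, columns = 1997 class,
# each ordered Low, Medium, High.
counts = [
    [30, 10, 0],
    [15, 100, 20],
    [5, 25, 45],
]
labels = ["Low", "Medium", "High"]

row_totals = [sum(row) for row in counts]
col_totals = [sum(row[j] for row in counts) for j in range(3)]
same = [counts[i][i] for i in range(3)]  # diagonal = same class in both years

# Percent Same = (#Same / Total) * 100 for each row and each column.
row_percent = [100 * same[i] / row_totals[i] for i in range(3)]
col_percent = [100 * same[j] / col_totals[j] for j in range(3)]

# Overall Percent Same = (Sum of #Same / Overall Total) * 100.
overall_total = sum(row_totals)
overall_percent = 100 * sum(same) / overall_total

for i, label in enumerate(labels):
    print(label, row_totals[i], round(row_percent[i], 1),
          col_totals[i], round(col_percent[i], 1))
print("Overall", overall_total, round(overall_percent, 1))
```

Counts off the diagonal are the cells that changed class between the two years; which side counts as Increase or Decrease depends on which map is placed on the rows.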

Based on discussions in lecture and readings, briefly discuss the information in the “Coincidence Summary Table.” Be sure your discussion includes comments on the relative percentages of coincidence for the diagonal elements (Same, no change), the above-diagonal elements (Decrease) and the below-diagonal elements (Increase).

< insert discussion >

Part 4 – Identifying Unusual Areas

Use the Statistics tab in the Shading Manager table to identify the mean (Average) and standard deviation (StDev), then calculate the following cut-off values for the 1997_Fall_P and 1997_Fall_K maps to identify areas of unusually high and low phosphorus and potassium in the field:

Question 4.

Unusually high P cutoff = P_mean + P_stdev = ___<insert answer>___
Unusually low P cutoff = P_mean - P_stdev = ___<insert answer>___

Unusually high K cutoff = K_mean + K_stdev = ___<insert answer>___
Unusually low K cutoff = K_mean - K_stdev = ___<insert answer>___

Using the 1997_Fall_P and 1997_Fall_K maps, create four binary maps identifying areas of unusually High_P = 1, Low_P = 2, High_K = 4 and Low_K = 8, with the background on all four binary maps = 0. Create a compound graphic containing all four maps with appropriate labels.

Embed the compound graphic showing all four maps…

<insert screen grab(s)>

Use Calculate to combine all four maps such that unique values and labels identify all possible combinations of the four unusual levels of P and K in the field. Be sure to label the 0 value condition.
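The level-slicing logic can be sketched for a single grid cell as below. The means and standard deviations passed in are hypothetical placeholders, not the Agdata.rgs statistics, and the label wording is only a suggestion. Since a cell cannot be both high and low in the same nutrient, only nine of the sixteen possible sums can occur.

```python
def level_slice(p, k, p_mean, p_sd, k_mean, k_sd):
    """Return the combined code for one cell: the sum of four binary maps."""
    high_p = 1 if p > p_mean + p_sd else 0
    low_p = 2 if p < p_mean - p_sd else 0
    high_k = 4 if k > k_mean + k_sd else 0
    low_k = 8 if k < k_mean - k_sd else 0
    return high_p + low_p + high_k + low_k

# Every sum that can occur, with a suggested label for each.
LABELS = {
    0: "Typical P and K",
    1: "High P only",
    2: "Low P only",
    4: "High K only",
    8: "Low K only",
    5: "High P and High K",
    9: "High P and Low K",
    6: "Low P and High K",
    10: "Low P and Low K",
}

# Hypothetical statistics: P mean 50, StDev 20; K mean 180, StDev 40.
code = level_slice(100, 250, 50, 20, 180, 40)
print(code, LABELS[code])
```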

Embed a screen-grab of the “unusual areas” map with labels…

Based on discussions in lecture and readings, briefly discuss the “level-slicing” processing you just completed and the information it generated. Be sure your discussion includes comments on both visual interpretation and summary statistics of the “unusual areas” results.

< insert discussion >

Part 5 – Calculating a Similarity Map

Maximize the 1997_Fall_P window, switch to grid display format and add the layer mesh for reference. Move the cursor to the field location Column = 8, Row = 24 using the coordinate reference in the lower-left corner of the screen. Double-click at this location and note the 1997_Fall_P and 1997_Fall_K values (96.5 and 160, respectively) that represent the “data pattern” for the location.

Select Map Analysis → Statistics → Relate and complete the dialog box as shown to calculate a PK_similarity map for the map location 8c, 24r. Set the display to a 2D continuous grid with a color ramp of 10 intervals in steps of 10 from 0 = red to 100 = green, with a yellow inflection point at 40 to 50.
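The exercise does not say how Relate computes similarity, so the sketch below shows only one plausible approach, under stated assumptions: each map is standardized, the Euclidean distance in (P, K) data space to the base point's data pattern is measured, and the result is rescaled so that an identical pattern scores 100 and the most different pattern in the field scores 0. Apart from the base point's values (96.5 and 160), the data are hypothetical.

```python
from math import hypot
from statistics import mean, pstdev

def similarity_map(p_values, k_values, base_index):
    """Similarity (0 to 100) of every cell to the cell at base_index.

    Assumed method, not necessarily MapCalc's: standardized Euclidean
    data distance, rescaled by the largest distance found.
    Assumes each map has some variation and the cells are not all identical.
    """
    p_mean, p_sd = mean(p_values), pstdev(p_values)
    k_mean, k_sd = mean(k_values), pstdev(k_values)
    zp = [(p - p_mean) / p_sd for p in p_values]
    zk = [(k - k_mean) / k_sd for k in k_values]
    distances = [hypot(a - zp[base_index], b - zk[base_index])
                 for a, b in zip(zp, zk)]
    farthest = max(distances)
    return [100.0 * (1.0 - d / farthest) for d in distances]

# First cell is the base point; the other three are made-up locations.
p_map = [96.5, 90.0, 40.0, 150.0]
k_map = [160.0, 170.0, 300.0, 100.0]
similarity = similarity_map(p_map, k_map, base_index=0)
print([round(s, 1) for s in similarity])
```

Standardizing first keeps the map with the larger numeric range (here K) from dominating the distance.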

Question 5. Screen-grab the PKN Similarity map and its shading manager table with the statistics tab showing and embed below…

Complete the following table for the map locations identified, using the Data Inspection tool by double-clicking on the PKN Similarity map and moving the cursor over the locations noted below.

Note: the Column, Row locations as you move the mouse over the map are indicated at the top of the “drill-down” data table.

Similarity Table

              Column   Row   1997_Fall_P Value   1997_Fall_K Value   Similarity Value
Base Point    8        24    96.5                160                 100
Location 1    8        25
Location 2    30       12
Location 3    7        43
Location 4    49       22

groupings as shown below—cluster 1= red and cluster 2= green for 2 clusters; original cluster 1=red, original cluster 2=green and new cluster 3= blue for 3 clusters; and original cluster 1= red, original cluster 2= green, original cluster 3= blue and new cluster 4= yellow for 4 clusters)…

<insert screen grab(s)>

Discuss any relationships you detect among the three maps. (Aside: don’t do the Composite analysis to determine how separable the clusters are).

< insert discussion >

Part 7 – Deriving a Dependent Variable Map (the following questions use Smallville.rgs database)

Use the View tool (binoculars) to generate a display of the Loan_accounts map in the Smallville.rgs database; screen grab and embed below.

Select Map Analysis → Neighbors → Scan and complete the following command.

SCAN Loan_accounts Total IGNORE 0.0 WITHIN 15 CIRCLE FOR Loan_concentration
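A roving-window total of this kind can be sketched as follows. This is a minimal pure-Python illustration, assuming the window includes the center cell and every cell whose center lies within the given radius (in cells); how MapCalc defines its circular window at the edges is not stated in the exercise. Ignoring 0.0 has no effect on a total, so it is not modeled.

```python
def scan_total(grid, radius):
    """Sum the values within a circular window around every cell."""
    rows, cols = len(grid), len(grid[0])
    # Offsets of all cells whose centers fall inside the circle.
    offsets = [(dr, dc)
               for dr in range(-radius, radius + 1)
               for dc in range(-radius, radius + 1)
               if dr * dr + dc * dc <= radius * radius]
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:  # skip cells off the map
                    total += grid[rr][cc]
            result[r][c] = total
    return result

# Tiny hypothetical map of loan accounts per cell, scanned with radius 1.
loan_accounts = [
    [2, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
loan_concentration = scan_total(loan_accounts, radius=1)
for row in loan_concentration:
    print(row)
```

With the exercise's radius of 15, the same function turns the sparse Loan_accounts points into a continuous Loan_concentration surface.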

Use the Shading Manager to theme the map with 35 User Defined Ranges (1 unit interval) from 0 (light grey) to 1 (red) through 34 to 35 (green) with a yellow color inflection at mid-range; screen grab and embed below.

Question 7. Screen grab and embed the displays of the Loan_accounts and Loan_concentration maps below (be sure to include a caption with a short description for the processing and results).

<insert screen grab(s) and discuss>

Part 8 – Generating Scatter Plots of Map Correlation and Univariate Regression Equation

Generate a Scatter Plot and Regression equation for Loan_concentration and each of the HousingDensity_surface, HomeAge_surface and HomeValue_surface maps by selecting Map Set → New Graph → Scatter Plot and completing the dialog box as shown.

Question 8. Screen grab and embed the scatter plots below…

Considering Loan_concentration as the Dependent variable (Y axis) and HomeValue_surface as the Independent variable (X axis), what is the 1) linear regression equation calculated for the two maps and 2) its R-squared value?

  1. Regression equation=

Identify, interpret and discuss the “slope” and “intercept” parameters of the equation. In your discussion, include an example prediction calculation of loan concentration if home value = 300 (thousand dollars).

< insert discussion and example calculation>

  2. R-squared index=

Interpret and discuss the R-squared value.
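For reference, the slope, intercept and R-squared of a univariate least-squares fit can be computed as sketched below. The paired values are hypothetical stand-ins for grid cells, not Smallville.rgs data, so the resulting equation is not the answer to the question.

```python
from statistics import mean

def linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, plus R-squared."""
    x_mean, y_mean = mean(x), mean(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - y_mean) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot  # share of variation in y explained by x
    return slope, intercept, r_squared

# Hypothetical cells: home value (thousand dollars) and loan concentration.
home_value = [100, 150, 200, 250, 300]
loan_concentration = [2, 5, 6, 9, 12]

slope, intercept, r_squared = linear_regression(home_value, loan_concentration)
prediction_at_300 = intercept + slope * 300
print(round(slope, 3), round(intercept, 2), round(r_squared, 3),
      round(prediction_at_300, 1))
```

The slope is the change in predicted loan concentration per thousand dollars of home value, and the intercept is the predicted value when home value is zero, which may fall outside the range of the data.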

_________________________________

Submit Optional question answers as separate Word document files with the Question number and your name (e.g., Optional_Ques9_1_Berry.doc)…do not include them with the normal weekly lab reports.

Optional Question 8-1 (3 extra credit points possible). Evaluating Regression Performance

Using the Loan_concentration and Loan_prediction maps you generated in Parts 8 and 9 above, select Map Analysis → Overlay → Calculate and complete the following command to calculate the difference between the actual and predicted loan concentration maps.

Question 8-1a. Screen grab and embed the Error map and its Shading Manager table with Statistics tab displayed below…

Briefly describe the error map in terms of its numeric and spatial distributions.

< insert discussion >

Question 8-1b. Select Reclassify → Renumber and renumber the Error map to generate a new map called Error_classes that isolates three error zones:

Zone 1: Unusually high overestimate (greater than +1 StDev)
Zone 2: Mean +/- 1 StDev
Zone 3: Unusually high underestimate (less than -1 StDev)
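The three-zone renumbering can be sketched as below. Two assumptions are made that the exercise does not confirm: the error is taken as predicted minus actual, so that positive values are overestimates, and the zone breaks are placed at the error map's mean plus or minus one population standard deviation. The values are hypothetical.

```python
from statistics import mean, pstdev

def error_zones(actual, predicted):
    """Return per-cell errors and their zone codes (1, 2 or 3)."""
    # Assumption: error = predicted - actual (positive = overestimate).
    errors = [p - a for a, p in zip(actual, predicted)]
    m, s = mean(errors), pstdev(errors)
    zones = []
    for e in errors:
        if e > m + s:
            zones.append(1)  # unusually high overestimate
        elif e < m - s:
            zones.append(3)  # unusually high underestimate
        else:
            zones.append(2)  # within 1 StDev of the mean error
    return errors, zones

# Hypothetical actual and predicted loan concentrations for six cells.
actual = [10, 12, 8, 15, 9, 11]
predicted = [11, 12, 14, 15, 3, 11]
errors, zones = error_zones(actual, predicted)
print(errors)
print(zones)
```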

Screen grab and embed the Error_classes map and its Shading Manager below…

Based on the map and table you generated, how successful do you think the prediction equation is?

< insert discussion >

How do you think it might be used in business planning?

< insert discussion >

Optional Question 8-2 (3 extra credit points possible; note: you must complete Optional Exercise 8- before beginning this exercise). Complete the following processing—

RENUMBER Error_classes ASSIGNING PMAP_NULL TO 2 THRU 3 FOR Error_class1
COMPUTE Loan_concentration Times Error_class1 FOR Loan_concentration1
COMPUTE HousingDensity_surface Times Error_class1 FOR Housing_density_surface1
COMPUTE HomeValue_surface Times Error_class1 FOR HomeValue_surface1
COMPUTE HomeAge_surface Times Error_class1 FOR HomeAge_surface1
REGRESS Loan_concentration1 WITH Housing_density_surface1, HomeValue_surface1, HomeAge_surface1 TO D:\Temp\NewTextFile.txt FOR Loan_prediction

…to generate a Stratified Regression Equation for the error class 1 region derived in Part 4.

Enter the Entire Project Area (from Exercise 9, Part 3) and the Error Class 1 regression equations (calculated using above steps) in the table below.

Regression Equation

Entire project area equation (Part 3):
Error Class 1 equation (above):

Briefly discuss any similarities or differences you note in the equations…

< insert discussion >

Evaluate the Error Class 1 regression equation for the entire project area to generate a Loan_prediction1 map and an Error1 map. Use the Error_class1 map as a mask, and then screen grab and embed the maps in the table below.

Comparison of Prediction/Error Maps (masked for Error1 zone)

Using Entire Project Area Equation:
Using Error_Class1 Equation:

Based on the Shading Manager summary statistics complete the following table:

Columns: Proximity to Roads Classes. Rows: Housing Density Classes.

                       Low (0-3 cells away)   Medium (3-7)   High (>7)   Totals
Low (0-10 houses)
Medium (10-20)
High (>20)
Totals

Briefly discuss the coincidence statistics in the table and how general relationships between the two maps can be interpreted…

< insert discussion >

Scatterplot/Regression Approach. Embed key intermediate maps and scatter plot:

<Insert Scatterplot Road_Prox (X) versus Housing_density (Y)>

Complete the following table:

Regression Equation:
R-squared:

Briefly discuss the regression/correlation statistics in the table and how general relationships between the two maps can be interpreted…

< insert discussion >

Optional Question 8-7. (3 extra credit points possible). Complete the following processing and write-up of the results—

  1. Generate a map showing three clusters using the HousingDensity_surface, HomeValue_surface and HomeAge_surface maps in the Smallville.rgs database. Enter your results in the “master table” at the end of this question.
  2. Derive a multivariate regression equation and its prediction map for the Cluster #1 area using the HousingDensity_surface, HomeValue_surface and HomeAge_surface maps in the Smallville.rgs database as independent variables and Loan_concentration as the dependent variable. Enter your results in the master table.

Hint: you need to create a binary map to mask the cluster areas by using Renumber on the cluster map, assigning 1 to Cluster #1 and PMAP_NULL to everything else. Use this map to isolate the data for the independent and dependent maps before deriving the regression equation and prediction map.

  3. Repeat the processing to derive regression equations and prediction maps for the Cluster #2 and #3 areas. Enter your results in the master table.

Hint: be sure you use a consistent map legend for all of the prediction maps.

  4. Combine the three individual prediction maps into a single prediction map.

Hint: use the Cover command.

  5. Derive the multivariate regression equation and its prediction map for the entire project area as described in part 4. Enter your results in the master table.

  6. Visually compare the results of the Combined and Entire prediction maps and comment on the similarities and differences you see at the end of the master table.
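The masking and Cover steps in the hints can be sketched as follows, using None to stand in for PMAP_NULL. This only illustrates the idea and is not MapCalc's implementation: the cluster and prediction values are hypothetical, and the cover function here simply keeps the first non-null value found for each cell, which is sufficient because the three cluster masks never overlap.

```python
def mask_to_cluster(values, clusters, cluster_id):
    """Keep values inside one cluster; set everything else to null (None)."""
    return [v if c == cluster_id else None for v, c in zip(values, clusters)]

def cover(*layers):
    """Combine layers cell by cell, taking the first non-null value."""
    combined = []
    for cell_values in zip(*layers):
        value = next((v for v in cell_values if v is not None), None)
        combined.append(value)
    return combined

# Hypothetical cluster map and per-cluster prediction maps for six cells.
clusters = [1, 1, 2, 3, 2, 3]
prediction_1 = [10, 11, 12, 13, 14, 15]  # from the Cluster #1 equation
prediction_2 = [20, 21, 22, 23, 24, 25]  # from the Cluster #2 equation
prediction_3 = [30, 31, 32, 33, 34, 35]  # from the Cluster #3 equation

combined_prediction = cover(
    mask_to_cluster(prediction_1, clusters, 1),
    mask_to_cluster(prediction_2, clusters, 2),
    mask_to_cluster(prediction_3, clusters, 3),
)
print(combined_prediction)
```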

<insert Cluster Map from step 1>
<insert caption/description>

Loan_concentration Map: <insert Cluster #1 Prediction map>
<insert Regression Equation for Cluster 1 from step 2>

Loan_concentration Map: <insert Cluster #2 Prediction map>
<insert Regression Equation for Cluster 2 from step 3>

Loan_concentration Map: <insert Cluster #3 Prediction map>
<insert Regression Equation for Cluster 3 from step 3>

Visual Comparison of Combined and Entire Prediction Maps

Loan_concentration Map: <insert Combined Prediction map; step 4>

Loan_concentration Map: <insert Entire Prediction map; step 5>

<insert your comments on similarities and differences you see; step 6>

Visual Comparison of Combined and Entire Prediction Maps

2000 Yield Volume Map: <insert actual 2000_yield_volume map>
<insert Combined Prediction map; step 4>

2000 Yield Volume Map: <insert actual 2000_yield_volume map>
<insert Entire Prediction map; step 5>

<insert your comments on similarities and differences you see; step 6>