The Creating Contours step requires the Spatial Analyst extension.
(From Access help menu)
In this exercise, you will be using a sample environmental database for an oil refinery located in Marcus Hook, PA. This database is set up like any other relational database, but contains information specific to environmental data at this facility. This database contains information such as sampling point locations, groundwater levels, and chemical analytical results. We could determine, for instance, the location of a specific well, information about that well, the groundwater levels that have been observed at the well, and the chemical concentrations that have been measured at the well. Databases work in both directions, so we could also find all the locations at which a specific chemical of concern (COC) was measured. Pretty cool!
With an environmental database, it would be beneficial to not only create reports and queries, but also to view and analyze this information. We can query concentration measurement locations in Access, but it is difficult to visualize these locations from a table of data. That is where ArcView becomes a very useful tool because of both its display and data storage capabilities. Let us see how this works.
Here is the file breakdown:
Arc: import cover bound.e00 boundary
Arc: import cover monit.e00 wells
Arc: import cover sourcearea.e00 sourcearea
Note that the file monit.e00 is imported as a new name, wells, and similarly, bound.e00 becomes boundary when imported. Usgsmap.tif is an image file. We will examine this further in the next section.
This refinery lies along the Delaware River. Marcus Hook is located in Southeast Pennsylvania, about twenty miles South of Philadelphia. The data are given in State Plane Coordinates for the South Region of Pennsylvania. This area is highly industrialized, but there is a residential area across from the northern boundary of the refinery. Your view should like something like this:
The University of Texas is currently involved with a research project at this facility, and the project goal is to develop a program for corrective action that will protect human and ecological receptors both on and off the property. The primary steps in the risk assessment process are to identify sources, source areas, transport mechanisms, points of exposure, exposure routes, and receptors. Sources are the physical structures that could cause a chemical release such as a tank or pipeline. Source areas are the locations of the highest concentrations in environmental media, such as groundwater or soil. A transport mechanism is the combination of chemical, physical, and biological processes that move a chemical from the source to the point of exposure. The point of exposure is the location at which an individual may come in contact with a chemical. The exposure route is the manner in which a chemical comes in contact with an organism, and receptors are persons that are or may be affected by a release. In this exercise, we will focus on identifying source areas based on concentration measurements taken from the soil on-site. We will also look at groundwater level measurements to identify potential transport pathways.
In order to understand these relationships, let us look at the Sample_Collection and Results tables. The Sample_Collection table is linked to the Results table through the common SAMP_ID field. Sample_Collection provides information related to each sample that has been collected, including the type of sample and location where it was obtained. The Results table displays all the concentration measurements for each sample. These concentrations are determined using specified analytical methods. View the data in any table by returning to the original "Database" window that you got when you opened the database and then double-clicking on any of the tables to see its contents. Click on each of the tables so that you can see what it contains.
More information on what each field refers to can be obtained from the Database Dictionary.
Access "Wizards" can be useful tools for designing queries, forms, and reports. However, we will use the design view to create our query, so select Design View and click OK.
You will be prompted for the tables and/or queries you want to obtain data from for your query. Add the Location, Groundwater_Levels, and Well tables and then close the window. You will then see a clipboard containing the tables you added and an empty query form. Notice that the relationships among tables are also shown on this clipboard. You can modify these relationships by double clicking on any of the links between tables. When creating queries, you need to show all tables in the relational chain even if you are not obtaining data from all of them. For instance, if you wanted to obtain the groundwater levels for a specific well and also list attributes for that well, you would need to show the Groundwater_Levels, Well, and Location tables even if you did not want to include any fields from the Location table.
From Groundwater_Levels, click on the LOC_ID field and drag it into the first empty column on the form. Also select GWL_DATE and GW_LEVEL from this table and add them to the query form. From the Well table, select TWC_ELEV, which is the top of the well casing elevation (also referred to as the top of the well riser elevation). The GW_LEVEL is the depth to groundwater as measured from the top of the well casing. By subtracting this depth from the elevation of the top of the well casing, the elevation of the water table is found, and from these data a map of water table elevations can be created. The elevations used here are referenced to the North American Vertical Datum of 1988 (NAVD88), which is the most accurate vertical datum presently available.
To calculate the elevations, we will create a new field using an equation. To do this, move the cursor to the next empty column field and type GW_ELEV:[TWC_ELEV]-[GW_LEVEL]. This expression creates a new column called GW_ELEV using the given mathematical expression.
Besides performing calculations, you can also set criteria so that the query returns only the type of data you desire. For this query, we want to limit the groundwater level measurements to a single date, so type #4/30/94# in the Criteria field for the GWL_DATE column. Also type Is Not Null in the Criteria field for the GW_LEVEL column so as to eliminate measurements with no values.
A final step that may not always be necessary but is good to do anyway,
is to remove duplicated records from your selected set. To do this, click
in the gray area within the Query Window (so that the cursor is not identifying
a particular table or field in the query), and then click on the
icon in the main tool bar. Switch Unique Values from No to
Yes and then close the window. Your completed form should look like
this:
Now click on to run
your query. You should have 36 records. Click on the x and then say Yes
to Save Changes? Call your query GW_Elevations. Be sure to include
the underscore because ArcView is not able to read spaces, and thus, without
the underscore in the table name, you will not be able to import your data
into ArcView later in the exercise. If you don't get the correct number
of records, close the query window, open a new query, and repeat the instructions
carefully. Make sure that you've added all three tables that are needed.
The Location table is needed to complete the relational linkage between
the Groundwater_Level and Wells data even though none of the query fields
are drawn from the Location table.
We are also interested in benzene concentrations at the refinery, so we will need to create another query to obtain this data. Again select New and choose Design View. Add the Location, Sample_Collection, and Results tables. Include the following fields in your query design:
From Sample Collection: LOC_ID, SAMP_ID, and MATRIX
From Location: EASTING and NORTHING
From Results: CAS, VALUE, DET_LIMIT, and UNIT
The order you put these in will not matter, but be sure your query contains these fields because they will be searched for later with an Avenue script. As criteria, type "SOIL" in the MATRIX column, "71-43-2" in the CAS column, and "UG/KG" in the UNIT column. Be sure to include the quote marks. Run the query and then save it as Benzene_Soil. You should have 177 records. The CAS field 71-43-2 is the identification number for benzene. Note that there are three tables involved in this query and two different key fields are used to connect records in these tables. The records with no data in the Value field had a measurement that was below the detection limit as specified in the DET_LIMIT field. The detection limit is the minimum concentration that can be measured in an analysis. There are 52 records with a value greater than the detection limit recorded. Note that detection limits vary depending on the analysis method and the actual constituents in a sample. Some samples with no detected concentration but high detection limits might have a value that is greater than that for a sample with a detected concentration but low detection limit. This problem is a data quality issue that will not be discussed here.
To be turned in: Write in words what has been done in the query you have just executed. Which fields are used to connect the tables? Which fields are used to select the required records? What is the query which has been executed? Which fields are picked up as ancillary information about the selected records? Choose one record from the resulting Benzene_Soil table and describe what information this record represents (utilize the database dictionary provided above).
Under the User DSN tab: click Add, select the Microsoft Access Driver, and then click Finish. For Data Source Name, type EnvAccess. Under Database, click Select and choose the location path for database.mdb. Choose OK and your driver should now appear under User Data Sources and look something like this:
Click OK to leave.
You have now built your database and are ready to import data into ArcView. From your previously opened project, open the attributes table for Wells. Notice the Loc_Id field has identifiers that are the same as the LOC_ID field in the database. If you are going to import data from a database and connect it to an existing coverage, you need to have two fields with identical values so that a link can be established.
Make the project window active and select SQL Connect under Project. In the Connection box, you should be able to scroll down and choose EnvAccess. If not, the driver is not set up properly. Click on Connect and you will see all the tables and queries that exist in the database (8 in all).
Click once on GW_Elevations to see the fields in that query. Double click on LOC_ID, GWL_DATE, and GW_ELEV. This process selects the fields we want to import. Now double click on GW_Elevations to set the input source. Criteria expressions can be added in the Where field, but this task is much easier to accomplish in Access. For Output Table, type Elevations. Your form should look like this:
Click on Query and ArcView creates a table of data using the database.
Close the window and open the Attributes of Wells table. We now want to join the data from the query with the well coverage. Make the Elevations table active and click on LOC_ID. Now make the Attributes of Wells table active and click on Loc_id. Use the Table/Join command to include the data you imported in the well attributes. If you go back to your view and use the identify tool to click on a well, you will now see the two fields you imported. The water level was not measured at some of the monitoring wells on 4/30/94 so the GW_LEVEL field is blank for them. NOTE: Each time you open this project, ArcView will requery the database and rejoin the data. This feature is particularly useful if you make a lot of changes to the database. Neat!
In the Project window, switch to the Script icon and select New. Load the text file cocvalue.ave and then compile the script. The script requires that the view to be the active document. You can use the Window menu to switch back and forth between the view and the script, but it is often easier to customize your view properties to eliminate the switching procedure. To do this, open your View and double click on any part of the blank gray area of the toolbar. You should see the customize box appear. Change the Category to Buttons. Scroll all the way to the right and click once on the help icon. Click Separator and then New to create a new icon, one space away from the help icon. Go to the table in the lower part of the window, and Double click on Click and select Script1 (Script1 has to be open in the project for this to be a valid choice). Then double click on Icon and choose an appropriate button. Your customize box should look like:
Do not hit Reset or Make Default here! Close the customize window by clicking on the x in the upper right corner and click on your new icon to run the script. When prompted to Enter the COC Query Name, type Benzene_Soil. For the output table name, type Benzene Soil Data. Finally, save the new shapefile as benzene to your working directory. What this script does is create a new point coverage in ArcView using the Easting and Northing values that were contained in the data extracted from MS Access in the Benzene_Soil query. An alternative method to using a script is to import the data using SQL Connect and then display the data with the Add Event Theme option available from the View menu. This alternative provides more flexibility when adding different types of queries.
Make the new theme visible and you will see that the benzene measurements are scattered across the site. Some areas contain many measurements and others very few. Open the attributes table and take a look at the data. The records with an empty value field are the "no detect" measurements.
To be turned in: Use the Statistics command for the Value field and report the mean, median, range, maximum, minimum, and standard deviation of the measurements. What do these values tell you about benzene concentrations measurements at the refinery?
You can find out more about how to statistically analyze the benzene data by looking at the Statistical Analysis of Environmental Data exercise for the Environmental Risk Assessment Course.
Turn on the Spatial Analyst extension from the File menu (Project window active). Open your View and make the Wells theme active. Choose Interpolate Grid under Surface. For Output Grid Extent, choose Same as Boundary. The Output Grid Cell Size, Number of Rows, and Number of Columns will be set according to the extent of the boundary theme. However, you can change these values if you want a finer or larger grid cell size. You might want to choose a 25' cell size, for example. Remember that your grid will only be as accurate as your data measurements (i.e., a fine grid will not be any more accurate than a larger grid if the amount of data points is small). Accept the defaults for your grid parameters.
You will now see the Interpolate Surface dialog box. Choose GW_ELEV for the Z Value Field and accept the default surface parameters. If you don't see GW_ELEV as an available field at this point, it means that you did not do the database query correctly using the ODBC connection through EnvAccess. You should see a grid appear of groundwater elevation measurements for the refinery. You can use the legend editor to color your grid, and there are even a few elevation color schemes. You can also load the legend, gwelev.avl, that has been provided for you. The elevation values for your grid are the elevations above mean sea level, so water will flow from the higher values to the lower ones. If you want to see the input data values for the groundwater elevations, you can use the Theme/Auto-Label command with the Wells theme. Your grid should look something like the one provided below. To save your grid, choose Save Data Set under Theme. Call your grid gwidw. Use the theme properties to change the theme name to GW IDW Surface.
To be turned in: Which well has the highest recorded groundwater level? Use the measuring tool to determine the distance across the property to the boundary line near the Delaware River. What is the maximum gradient of groundwater flow on this site? All map units, vertical and horizontal are in feet.
Follow the same procedure above, except this time choose Spline for the Interpolate Surface Method. Also choose Tension for Type with a weight of 10. The default value range will be somewhat larger for this grid, and you can use the gwelev.avl legend to allow for a better comparison with the IDW grid. Note that the grid is only accurate within the facility boundary area. If you click on a point in the Delaware River, you will notice it has a large elevation value. Its elevation should be zero (assuming the river is at mean sea level). Again check the layout below to compare your answer. Save your grid as gwspline and rename it GW Spline Surface.
Repeat this procedure for the concentration data (use Value field). For the Spline Method, use Tension for Type with a weight of 10000. Call your two grids cocidw and cocspline and rename them Benzene IDW Surface and Benzene Spline Surface. Load the legend coc.avl for both grids to obtain a better comparison. Here is what your four grids should look like:
We could also have generated contour lines instead of grids. The procedure is identical except that you choose Create Contours instead of Interpolate Surface from the Surface menu. Feel free to experiment with this option and with different weight parameters. You will notice that the weights can have a significant influence on the results. You can also overlay the contours created for a particular interpolated surface so as to more easily understand what the interpolated grid represents. If you already have an interpolated grid and wish to create the contours from that, just highlight the grid and then use Surface/Create Contours.
To be turned in: Briefly describe the benzene concentration grids. Which sample contains the highest benzene concentration? Which interpolation method do you think is better for mapping groundwater measurements and which interpolation method is better for mapping benzene concentrations? Present a copy of the map which you feel best presents the groundwater elevation data and another for the benzene concentration data.
It turns out that we have a lot of benzene soil measurements from this area. We can query these measurements by making the benzene.shp theme active and selecting Select byTheme under Theme. Select features of the active theme that Intersect the selected features of Sourcearea and choose New Set. Open the attributes table of benzene.shp and use the promote icon to view these records. You should see 28 selected records. We now want to analyze this data using Microsoft Excel, so use the Export command under File. Choose dBase for the export format and save the table to your working directory as records.dbf. ArcView will only export the highlighted records to your dBase file. Start Excel and open the table (be sure to select dBase Files for Files of type:). You should see something very familiar to the table you had in ArcView.
If the soil samples are taken from an area containing an infinite number of samples, or are taken from a finite supply and replaced, the ratio of soil samples above a threshold to the total number of samples obtained will be described by a binomial distribution:
From our data set, we can obtain the best estimate probability that any soil sample in the area will be in violation of a set threshold by dividing the number of violations by the total number of samples. You can use your Excel spreadsheet to calculate this probability. Highlight your records and then choose Data/Sort. Sort by Value in Descending order. Your first five results should look like this:
For illustration purposes in this exercise, let us use a threshold concentration of 15 ug/kg (this is actually much lower than EPA target levels for benzene in soil). The best estimate probability = (# of samples greater than 15 ug/kg)/(total # of samples). One of the benefits of this method is that it includes the no detect data. However, this method does not take into account the actual value measurements (i.e., for the given threshold, a sample with a value of 50,000 ug/kg is not distinguished from a sample with a value of 20 ug/kg). In this case, the total number of samples is equal to the number of records in the table, namely 28.
To be turned in: What is the best estimate probability for the source area data?
We now have a best estimate probability, but it will be more appropriate
to express the probability estimate as a range of possible values with
a degree of confidence that the true probability falls in that range. A
typical statement of this type is "The probability of measuring a concentration
above the target level in a single trial lies between 40% and 60% with
a confidence level of 95%." As an appropriate screening level for our data,
we would like to calculate the upper bound on our probability estimate
with a confidence level of 95% (two-sided). This means that there will
only be a 2.5% chance that the actual probability of violation will be
greater than the upper limit we calculate.
Now we are ready to solve for the upper bound. Select Tools/Solver. We want to change the first cell until it solves the equation, and then place the result in the last cell. Thus, type $J$1 under By Changing Cells, and type $L$1 under Set Target Cell. In the Subject to the Constraints section, click Add and type Cell Reference: $K$1 >= $K$2, and then click OK. Make sure Equal To: is set to Max and then click Solve. Accept the solution and the value in cell L1 is your upper bound (%). Your completed Solver form should look like this:
Now that we have an upper bound, we could add this information to
our Sourcearea attributes table in ArcView. If we had other source areas,
we could follow a similar procedure and then use the legend editor to color
the source areas according to their upper exceedance probability. This
map would be a useful tool for displaying possible risks associated with
each source area. OK, you are done!