SP-399 SKYLAB EREP Investigations Summary

[377] APPENDIX D

Principles of Photographic and Digital Data Analysis

In this appendix, the methods and terms most commonly used in the analysis of photographs and images, and in the preprocessing and analyzing of multispectral scanner digital data are discussed. The analysis of microwave data requires highly specialized techniques; therefore, detailed discussion of procedures and methods for analyzing such data is contained in section 6.

Fundamentals of Photographic Interpretation 1

ROBERT N. COLWELL a

a University of California at Berkeley.

Photographic interpretation involves the systematic examination of negative or positive prints and transparencies for the purpose of identifying objects and judging their condition or significance. This process requires planning a sequence of activities which includes defining the data geometry, enhancing the data, identifying the data characteristics, and interpreting the results.

PHOTOGRAPHIC GEOMETRY

The first step in photo-interpretation is to establish the method of acquiring the data and its significance in reference to a set of known features or relationships. By this means, the interpreter can define the general proportional perspective of features on the photograph and can form a standard for measurement of specific objects.

The geometric relationships among film negative, camera lens, and target for vertical photographs are illustrated in figure D-1. Three principal points that define the positions a, b, and c on the film correspond to [378] points A, B, and C on the imaged surface. The perpendicular distance f from the camera lens to the film is the focal length of the camera.

FIGURE D-1.- Diagram illustrating the imaging of ground objects in vertical aerial photographs.

The scale S is the relationship between a distance on the photograph and the corresponding distance on the ground. For example, in figure D-1,

$$S = \frac{ab}{AB} \tag{D1}$$

The larger the denominator of the fraction, the smaller is the scale of the photograph. The scale of the photograph can also be determined from the relationship between the camera focal length f and the camera altitude H at the instant of photography. Then,

$$S = \frac{f}{H} \tag{D2}$$

For example, the scale of the original film image of the Skylab Earth Terrain Camera (S190B) with a 45.7-cm focal length, taken at a spacecraft altitude of 435 km, is

$$S = \frac{0.457\ \text{m}}{435\,000\ \text{m}} = \frac{1}{951\,860}$$

This fraction indicates that one measurement unit on the photograph corresponds to 951 860 of the same units on the ground. Thus, the scale of the photograph is approximately 1:950 000.
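The arithmetic of equation (D2) can be checked with a short sketch. The function names below are illustrative assumptions, not part of the original report; only the focal length and altitude come from the text.

```python
def scale_denominator(focal_length_m: float, altitude_m: float) -> float:
    """Return x in the scale 1:x, using S = f/H with both lengths in meters."""
    if focal_length_m <= 0 or altitude_m <= 0:
        raise ValueError("focal length and altitude must be positive")
    return altitude_m / focal_length_m


def ground_distance_m(photo_distance_m: float, focal_length_m: float,
                      altitude_m: float) -> float:
    """Convert a distance measured on the film to the ground distance."""
    return photo_distance_m * scale_denominator(focal_length_m, altitude_m)


# Earth Terrain Camera example from the text: f = 45.7 cm, H = 435 km.
denominator = scale_denominator(0.457, 435_000.0)
print(round(denominator))  # 951860

# Under these assumptions, 1 mm on the original film spans roughly 952 m.
print(round(ground_distance_m(0.001, 0.457, 435_000.0), 2))  # 951.86
```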

PHOTOINTERPRETATIVE EQUIPMENT

Photointerpreters use equipment for three general purposes: viewing, measuring, and transferring or recording detail. Measuring instruments may be used on single photographs or stereoscopic pairs. Viewing equipment is used to increase the interpreter's ability to scan or study photographs. Some of these instruments provide a stereoscopic (three dimensional) view of overlapping photographs under various magnifications, whereas others provide only a two-dimensional but magnified view of objects on photographs. Light tables aid the photointerpreter in screening and selecting photographs, judging the importance or quality of the photographs, and examining transparencies or negatives. The stereoscope, a binocular viewing instrument that through a combination of lenses, mirrors, and prisms provides a three-dimensional view of photographs, is one of the most important instruments used in photointerpretation. The stereoscopic principle is used in measuring and plotting instruments that are designed primarily for viewing photographs and/or images generated from multispectral scanners (e.g., S192) or enhancement of photographs. Measurement is also done with a hand lens on which a scale has been etched. Various ruler-type and caliper-type scales are also used. To transfer information from the photograph to a base map or descriptive chart, optical and mechanical devices are used. In many Earth Resources Experiment Package (EREP) experiments, zoom transfer scopes were used to optically converge the scale of the photograph with that of the base map. Optical devices were used to enlarge or reduce the photographs and aid in the transfer of data. Proportional dividers and pantographs can also be used to transfer small amounts of detailed information.

PHOTOGRAPH ENHANCEMENT

Enhancement devices are equally useful on photographic and electronically derived images for increasing the amount of information that can be interpreted by the analyst. Forms of enhancement include density slicing, color coding, and combining multiple images into a single composite. Some analysts advocate the use of multiple lantern-slide projectors or other optical devices for this purpose, whereas others prefer electronic devices such as closed-circuit television equipment. A combination of optical and electronic devices is especially useful when several multidate or multiband images of the same area, each containing different information, are available.

In addition to the advantages common to both optical and electronic enhancement techniques, some important relative advantages and limitations are associated with each technique. Generally, optical enhancement methods and techniques provide better spatial resolution (i.e., higher definition) and enable the use of simpler and less costly equipment. However, electronic methods and techniques offer advantages based on their capability to quickly select and mathematically expand data, to assign color hues, to combine multidate and multiband data for analysis, and to geometrically correct an image or a photograph.

[379] PHOTOGRAPH CHARACTERISTICS

The synoptic view of Earth provided by space photographs requires the analyst to revise his concepts of the significance of shape, size, shadow, tone, color, texture, and pattern characteristics of objects in the image. As an example, much of the EREP data consists of vertical photographs covering in excess of 26 000 km² of the Earth's surface. From this new perspective, some features assume greater importance in space photographs than in the ground view, and details that may predominate in the ground view may be nearly indistinguishable from space.

The shape of an object as seen in the plan view portrayed by a vertical photograph provides an important and sometimes conclusive indication of its structure, composition, and function. For example, in the vertical view of a forest, its economic and recreational value may be apparent. The vertical view of a landform may show effects of tectonic and erosional processes. To the motorist, a cloverleaf intersection is an incomprehensible maze; to the analyst, however, the form and function of the intersection are clearly evident. Shape is valuable to the interpreter because it establishes the class of objects to which an unknown object must belong; shape frequently enables a conclusive identification, and it aids in the understanding of significance and function of the object.

The size of an object is one of the most useful clues to its identity. By measuring an unknown object on a photograph, the interpreter can eliminate from consideration groups of features or phenomena. Size can also be used to formulate sets of possible inferences that may help to identify significance. For example, measurement of the length of a runway can indicate whether or not an airfield can accommodate large jet aircraft.

Shadow in vertical photographs sometimes aids the interpreter by providing dimensional representations of landforms or other objects of interest. Although shadows can affect the photointerpreter's depth perception, they are often an invaluable aid in estimating heights or depths of objects. In nearly flat terrain, subtle variations on the surface which would otherwise be difficult to detect are emphasized by shadows. However, because objects in the shaded area reflect so little light to the spacecraft camera, they are rarely visible in space photographs.

Tone and color perception are important elements in the analysis of photographs. On black-and-white photographs, distinctions between objects are observed only in tones of gray. On color photographs, hue, brightness, and saturation as well as tone can be used to distinguish objects. The color of surface objects, particularly when viewed from space, rarely corresponds to the ground-level perceptions. A body of water may appear in tones ranging from white to green to black, depending on the Sun angle, the viewing angle, and the condition of the surface reflecting light to the camera lens. A black asphalt road may appear very light in tone because of its smooth surface. When the photointerpreter understands the factors that govern the photographic tone or color of objects of interest, these characteristics become major clues to their identity or composition.

Texture in photographs is the result of tonal repetitions in groups of objects too small to be discerned as individual objects. Thus, the size of an object required to produce texture varies with the scale of the photograph. In large-scale (1:5000) photographs, trees can be seen as individual objects and their leaves or needles, although not seen separately, contribute to the texture of the tree crowns. Similarly, in small-scale (1:250 000) photographs, the individual tree crowns, although not seen separately, may contribute to the texture of the whole stand of trees. Thus, the texture of a group of objects (e.g., a timber stand of a certain species composition) may be distinctive enough to serve as a reliable clue to the identity of the objects.

Scientists have stressed the pattern, or the spatial arrangement and association, of objects as an important clue to their origin, to their function, or to both. Geographers and anthropologists study settlement patterns and their distribution to understand the effects of diffusion and migration in cultural history. Outcrop patterns provide geologists clues to geological structure, and drainage patterns have orderly association with structure, lithology, and soil texture. The varying relationships between vegetative elements and their environment produce some characteristic patterns of plant association.

Many regional patterns and associations that formerly could be studied only through laborious ground observation are clearly and quickly visible in space photographs. Moreover, these photographs may capture many significant patterns, such as fracture traces and tonal "anomalies," that the ground observer might overlook or misinterpret because of his limited field of view. The trained observer appreciates the significance of space photographs chiefly through his understanding of patterns and associations on the Earth's surface.

[380] INTERPRETATIVE ACTIVITIES

The interpreter observes characteristics of the photograph or image and determines the identity and significance of the objects they represent. The analysis process occurs in a time sequence beginning with a search process that results in the detection of important features, some of which may require measurement. Measurement is followed by consideration of the features in terms of collateral information, usually nonpictorial, from the interpreter's special field of knowledge. On the basis of these actions, hypotheses concerning the identity and significance of the features are formed. Finally, the interpreter must evaluate the identity and significance. In some instances, it may be possible to perform field checks to validate his deductions as to the identity of an object. Thus, the five important sequential activities of photointerpretation include search techniques, the use of indicators, measurement, deductive reasoning, and field checking.

Interpretation begins with close examination of all details that are considered relevant. However, most experienced interpreters prefer to begin by scanning the photograph or image as a whole. It is usually necessary to study the photograph or image with reference to an index map or a photomosaic which serves as an index map. A large-scale map, preferably topographic, is useful as a base map.

Many characteristics may provide indications as to the identity of an unknown object. No single indicator is likely to be infallible; but if all or most of the indicators lead to the same conclusion, the conclusion is probably correct. Photointerpretation, therefore, is highly dependent on the science of probabilities. The principle involved, known as "convergence of evidence," requires that the interpreter first recognize basic features or types of features and then consider their arrangement (pattern) in the areal context. Several alternative interpretations may be possible. With the aid of photointerpretative keys, critical examination of the evidence usually reveals that all interpretations except one are unlikely or impossible.

Photointerpreters can measure the exact dimensions of features using scales and other instruments. Generally, however, photointerpretative measurement involves making visual estimates of the size and shape of an object. A reasonably correct estimate of dimensions is essential to correct identification. Plotting and drawing to known scales may also be regarded as further activities of measurement.

By means of deductive reasoning based on the previously described activities, the interpreter can identify objects and features on the photograph or image. On the basis of these deductions, the interpreter documents his response by labeling (naming or describing) the identified features. The labeled products are often called thematic or classification maps.

Much scientific knowledge has been tested by the patient correlation of photographed or imaged features with ground features by means of careful field checking. Many established correlations are taught as basic principles in the various photointerpretation disciplines. Nevertheless, in almost every interpretative task, unknowns or uncertain conclusions will arise that must be checked in the field. The interpreter should conduct field checking whenever feasible to validate his work. Some types of work require field correlation before and after the interpretation. The amount of fieldwork necessary varies with the requirements for accuracy, the complexity of the area, the quality of the photographs or images, and the ability of the interpreter.

1 Positive prints and transparencies are often referred to as "images." In this report, "image" or "imagery" is defined as a print or a transparency generated from electronic data (i.e., multispectral scanner).

Digital Analysis Techniques

ROGER M. HOFFER b AND A. VICTOR MAZADE c

b Purdue University.
c Lockheed Electronics Company, Inc.

Current procedures for processing and analyzing digital data from the EREP Multispectral Scanner (S192) or similar systems involve four primary activities: preprocessing, display and enhancement, analysis and classification, and evaluation. Although many variations and combinations of these activities provide a flexible analytical system, this section is intended to give a very brief overview of techniques used by many investigators.

[381] PREPROCESSING

Preprocessing involves various manipulations that make the data more usable. It is important to note that no actual data analysis or interpretation occurs in this phase of the activity. A major preprocessing activity performed on the digital data is editing, in which the portion of the data of particular interest to investigators is located and extracted from the total data set.

A second activity is reformatting. Optical-mechanical systems record scanner data in either a straight or a conical scan-line configuration. Landsat is typical of the former, whereas EREP (S192) is characteristic of the latter. Because most users of digital data do not have the capability to display conical scan-line formats, a special preprocessing of the S192 data was required to display the data in approximately the correct geometric output format. Because geometric distortion across the scene makes the location of specific geographic features difficult, the conical scan-line data were resampled and data tapes reconstructed so that the data could be displayed using straight scan-line display equipment. This process enabled the users to display the data with their own equipment and still obtain output products having reasonably accurate geometric fidelity.

The third major preprocessing task involves digital filtering of the data to improve the data quality (signal-to-noise ratio). Preliminary work with the S192 data indicated that some wavelength bands were extremely noisy. These noise patterns were complex and included both systematic and random noise. To decrease the effect of the systematic noise, a series of digital noise filters was developed, and much of the S192 data was preprocessed with the digital noise filters.

For selected Earth resources studies, some investigators have subjected their data sets to a geometric correction and rotation sequence designed to correct for the orbital path of the satellite and to enable display of the data at a geometrically correct scale. Other investigators have performed additional data preprocessing to adjust the data values to a particular discipline requirement, including merging the data for bands with two scientific data outputs, or have transformed the data by mathematical methods into an entirely new set of values. A few investigators have registered their scanner data to other data sets, such as Landsat-1 data or U.S. Geological Survey topographic maps.

DISPLAY AND ENHANCEMENT

The display of computer-enhanced data can assume several forms. One of the more common procedures is to illuminate imagery of three individual wavelength bands of the original satellite data through appropriate color filters to obtain false-color composites. Using digital data, channels can be assigned to the three color guns of a cathode-ray device and thus create a false-color-composite image. This approach has been successfully used for the Landsat data. Numerous combinations of false-color-composite images can be obtained.
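The assignment of digital channels to color guns can be illustrated with a minimal sketch. The band names and data values below are hypothetical, and the functions are illustrative assumptions rather than any system used by the investigators.

```python
from itertools import permutations


def false_color_composite(bands, red, green, blue):
    """Pair three bands pixel by pixel as (R, G, B) display triples."""
    chosen = [bands[red], bands[green], bands[blue]]
    if len({len(band) for band in chosen}) != 1:
        raise ValueError("all bands must cover the same pixels")
    return list(zip(*chosen))


def possible_assignments(band_names):
    """List every ordered choice of three bands for the three color guns."""
    return list(permutations(band_names, 3))


# Hypothetical four-band scene with two pixels per band.
bands = {
    "band_a": [10, 200],
    "band_b": [50, 60],
    "band_c": [120, 30],
    "band_d": [5, 250],
}

composite = false_color_composite(bands, red="band_c", green="band_b", blue="band_a")
print(composite)                         # [(120, 50, 10), (30, 60, 200)]
print(len(possible_assignments(bands)))  # 24
```

Even with only four bands, 24 distinct gun assignments are possible, which suggests why numerous composite combinations can be obtained from a multiband data set.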

A variety of mathematical functions can be applied to the digital data to enhance display. The data value range may be expanded or compressed to increase or decrease contrast, or data value ranges may be segmented with colors assigned to identify different ranges. Data values may be combined by addition and subtraction or may be ratioed to emphasize degrees of similarity or difference between channels. By taking mathematical transforms of groups of channels, entirely new data sets may be formed to isolate a particular feature in the data.
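Three of the enhancement functions named above (range expansion, segmentation of data ranges into colors, and channel ratioing) can be sketched as follows. This is a simplified illustration under assumed 0-to-255 display values; the function names, slice boundaries, and color labels are hypothetical.

```python
from bisect import bisect_right


def linear_stretch(values, out_min=0, out_max=255):
    """Expand the observed data range to fill the display range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [out_min for _ in values]
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (v - lo) * scale) for v in values]


def density_slice(values, boundaries, colors):
    """Assign a color to each value according to the range it falls in.

    boundaries must be sorted; len(colors) must equal len(boundaries) + 1.
    """
    if len(colors) != len(boundaries) + 1:
        raise ValueError("need one more color than boundaries")
    return [colors[bisect_right(boundaries, v)] for v in values]


def band_ratio(numerator, denominator):
    """Ratio two channels pixel by pixel; None marks a zero denominator."""
    return [n / d if d != 0 else None for n, d in zip(numerator, denominator)]


stretched = linear_stretch([40, 50, 70, 80])
print(stretched)                                                     # [0, 64, 191, 255]
print(density_slice(stretched, [85, 170], ["blue", "green", "red"]))  # ['blue', 'blue', 'red', 'red']
print(band_ratio([60, 90, 5], [30, 45, 0]))                          # [2.0, 2.0, None]
```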

ANALYSIS AND CLASSIFICATION

During the past decade, considerable progress has been made in the development of computer-aided analysis techniques involving the application of pattern recognition theory to multispectral scanner data. The basic procedure used for analysis of multispectral scanner data normally includes the following steps.

1. Definition of the parameters of the classification problem

2. Selection of a classification technique

3. Identification of areas within the scene for which reference data (i.e., ground truth) are known

4. Calculation of various statistical parameters for the area of interest

5. Classification of the data into spectral classes

6. Display and/or tabulation of the classification results

The first step in classification consists of the formulation of a set of objectives against which the final results will be measured, the establishment of a procedural analysis plan, and the selection of a data set.

[382] The objectives may include such items as the desired accuracy standards or informational content of the final product. The procedural plan outlines the major steps and contingencies to be followed and defines the classification algorithm to be used. A data set that is appropriate for the objectives is selected. The set may include a small area about which detailed information is desired, may be a test area from which inferences to a much larger area are possible, or may actually include a relatively large geographic area (i.e., several thousand square hectometers).

Sample areas for which detailed reference data (i.e., ground truth) are available are identified in the data set to develop a truly representative set of training statistics. Reference data may also include results of field surveys or radiometric measurements made at the time of data collection, as well as information from photographs and maps. The sample areas are located within the data set and labeled for future use. Some of the sample areas are designated "training areas," and the reflectance values of these areas are used to define the classification statistics. Other sample areas, designated test areas, are reserved for evaluating the accuracy of the final product.

The two basic classification techniques are supervised and unsupervised. In the supervised technique, the computer, in essence, compares the statistical parameters of each point (picture element or pixel) with the statistical parameters of known surface features selected by the analyst. Based on probability decisions defined by the classification algorithm used, the computer assigns each data point to the most similar defined feature type. This supervised classification technique is used when the features of known interest are easily located in the data set and are homogeneous in character. For example, when an analyst knows that a particular agricultural crop was imaged in a specific portion of the data, this area can be identified and used as a "training field" from which spectral reflectance values can be determined. Such training fields are defined for several crop species and cover types. The data values for each remaining data point in the scene then are compared to the reflectance values of the training data and classified, on the basis of the probability parameters, into one of the crop species or cover-type categories defined by the training data. After classification, the analyst may also assign a "threshold" parameter that defines the maximum amount of difference acceptable between the data point reflectance values and the reflectance values of the known feature type. If the probability decision falls below a defined threshold level, the data are displayed as a blank.
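The supervised procedure can be sketched in a few lines. The sketch below makes a simplifying assumption that is not stated in the text: each band is treated as an independent Gaussian variable, so that only a mean and a variance per band are needed for each training class. The class names, data values, and threshold are hypothetical.

```python
import math


def train_statistics(training_fields):
    """Compute per-band mean and variance for each labeled training field.

    training_fields maps a class label to a list of pixel vectors; each
    field needs at least two pixels so that a variance can be estimated.
    """
    stats = {}
    for label, pixels in training_fields.items():
        n = len(pixels)
        means = [sum(band) / n for band in zip(*pixels)]
        variances = [
            max(sum((v - m) ** 2 for v in band) / (n - 1), 1e-6)
            for band, m in zip(zip(*pixels), means)
        ]
        stats[label] = (means, variances)
    return stats


def log_likelihood(pixel, means, variances):
    """Log of the Gaussian density, assuming independent bands."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        for x, m, var in zip(pixel, means, variances)
    )


def classify(pixel, stats, threshold=None):
    """Assign the most likely class, or None (blank) below the threshold."""
    scores = {label: log_likelihood(pixel, m, v) for label, (m, v) in stats.items()}
    best = max(scores, key=scores.get)
    if threshold is not None and scores[best] < threshold:
        return None
    return best


# Hypothetical two-band training fields.
stats = train_statistics({
    "corn": [(30, 80), (32, 84), (28, 82)],
    "water": [(10, 5), (12, 7), (11, 6)],
})
print(classify((31, 81), stats))                   # corn
print(classify((11, 6), stats))                    # water
print(classify((200, 200), stats, threshold=-50))  # None
```

The last call shows the role of the threshold: a data point that resembles none of the training classes is left blank rather than forced into the nearest category.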

In the unsupervised technique, often called "clustering," the data points are classified on the basis of similarity to other data points in the scene. The computer examines the spectral signature of each data point in the scene and then statistically divides the entire scene into the number of spectral classes or groups specified by the analyst. This technique is used when features of known interest cannot be specifically located in the scene or are not homogeneous. For example, an analyst interested in a wild-land area containing a very complex mixture of cover types may program the computer to classify the scene into 12 or 16 groups having similar spectral characteristics. After the classification has been performed, the analyst attaches a significance to the classes on the basis of some other reference information, such as an aerial photograph.
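The text does not name a specific clustering algorithm. As an illustration only, the sketch below uses the widely known k-means procedure, with a deterministic choice of starting centers so that the result is repeatable; the function names and data values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two spectral vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(pixels, centers):
    """Label each pixel with the index of its nearest cluster center."""
    return [
        min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
        for p in pixels
    ]


def cluster(pixels, k, iterations=20):
    """Divide the pixels into k spectral classes (k-means sketch)."""
    if not 1 <= k <= len(pixels):
        raise ValueError("k must be between 1 and the number of pixels")
    # Deterministic start: k pixels spaced evenly through the data.
    step = len(pixels) // k
    centers = [tuple(float(v) for v in pixels[i * step]) for i in range(k)]
    for _ in range(iterations):
        labels = assign(pixels, centers)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                updated.append(tuple(sum(band) / len(members) for band in zip(*members)))
            else:
                updated.append(centers[c])  # keep an empty cluster where it was
        if updated == centers:
            break
        centers = updated
    return assign(pixels, centers), centers


pixels = [(10, 5), (12, 7), (11, 6), (30, 80), (32, 84), (28, 82)]
labels, centers = cluster(pixels, k=2)
print(labels)   # [0, 0, 0, 1, 1, 1]
print(centers)  # [(11.0, 6.0), (30.0, 82.0)]
```

The numbered classes carry no meaning by themselves; as the text notes, the analyst attaches significance to them afterward by using reference information.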

Some analysts use a combination of the two techniques to take advantage of the special features of each. Regardless of the technique, it is important to recognize that, in most cases, the results obtained are as much a function of the manner in which the analyst has interfaced with the data as they are of the particular algorithm being used. Analyst skills in quantifying the overall objectives and in understanding the computer-processing system are of critical importance in the effective use of computer-aided analysis techniques.

Classification of the data is accomplished by the computer, using one of several possible algorithms available. The maximum-likelihood algorithm based on a Gaussian distribution has been commonly used in the past and was the algorithm used in several of the Skylab investigations. Other decision strategies can be used that employ functions based on linear discrimination or geometric proximity (nearest neighbor). The computer time required for the actual classification task may range from a few seconds to several minutes, depending [383] on the number of wavelength bands in the data set, the number of spectral classes defined, the type of computer, and the efficiency of the software. It is also important to recognize that the classification can be extended to relatively large geographic areas. It is these types of classification tasks, involving thousands or millions of square hectometers, that most effectively use the power of the computer (e.g., rapid classification and tabulation of large quantities of data).
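For reference, the Gaussian maximum-likelihood decision rule is usually written in the textbook form below. The notation is an assumption introduced here and does not appear in the original report: x is the vector of data values for one data point, and each spectral class i has a mean vector, a covariance matrix, and a prior probability estimated from its training areas.

```latex
g_i(\mathbf{x}) = \ln P(\omega_i)
  - \tfrac{1}{2} \ln \lvert \Sigma_i \rvert
  - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^{\mathsf{T}} \Sigma_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i)
```

The data point is assigned to the class with the largest value of g_i(x). Because the quadratic term is evaluated for every class and grows with the number of bands, this form is consistent with the dependence of computer time on the number of wavelength bands and spectral classes noted above.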

Display of the classification results is normally accomplished by using either an image-type or a tabular format. Image-type formats, often obtained from a standard computer line printer, print out a symbol that is distinctive for each classification category. The symbols are arranged in the sequential fashion of the data and present a geographic distribution of the results. Direct image formats provide similar information but use cathode-ray tubes or film recorders to display the results, often with colors to indicate the various cover types and to identify their location. This technique is useful for comparing the results with maps or aerial photographs.

Tabular formats may be summary statistics that indicate the number of data points classified into each category. These formats are used when estimates of the total area of a class type are desired, such as the number of square hectometers of hardwood forest in the scene. Because each data point or resolution element of satellite data represents an area on the ground (approximately 0.56 hm²/resolution element for the reformatted S192 data), a conversion factor is applied to determine the number of square hectometers in each cover type of interest. The percentage of the entire classified area, as well as the area covered by each of the species or cover types of interest, can be rapidly calculated.
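The tabulation described above amounts to counting data points per class and applying the conversion factor. In the sketch below, the 0.56 hm² figure comes from the text; the function name and the small classified scene are hypothetical.

```python
from collections import Counter

HM2_PER_PIXEL = 0.56  # approximate value for reformatted S192 data, from the text


def tabulate_areas(class_map, hm2_per_pixel=HM2_PER_PIXEL):
    """Summarize a classified scene as pixel counts, areas, and percentages."""
    counts = Counter(label for row in class_map for label in row)
    total = sum(counts.values())
    return {
        label: {
            "pixels": n,
            "area_hm2": n * hm2_per_pixel,
            "percent": 100.0 * n / total,
        }
        for label, n in counts.items()
    }


# Hypothetical classification result, two scan lines of three pixels each.
class_map = [
    ["forest", "forest", "water"],
    ["forest", "crop", "water"],
]
for label, row in tabulate_areas(class_map).items():
    print(f"{label:8s}{row['pixels']:4d}{row['area_hm2']:8.2f}{row['percent']:8.1f}")
```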

ANALYSIS EVALUATION

Analytical results are evaluated using both qualitative and quantitative techniques. The analyst may obtain a quick subjective impression by comparing the display of the classification results with an aerial photograph or a thematic map. When satisfied that the product is generally acceptable, he then undertakes a more objective evaluation. Quantitative evaluation is extremely important for assessing the accuracy of the classification obtained at different times of the year or for comparing results obtained from the use of different combinations of wavelength bands in the analysis.

In the most common approach, test areas reserved during the preliminary analytical stages are evaluated by comparing the known reference information with the classification results. Any data point classified into the same category as defined by the reference information is called "correct"; others are errors. Standard statistical techniques are then used to determine the quantitative accuracy and significance based on the number of correct and erroneous data points in the test area. Depending on the number and adequacy of the test areas, the tested accuracy can be projected to the entire scene.

Another test can be performed by comparing the total area classified in a category with the total area derived from some other source, such as census-type data or areas obtained by interpretation of aerial photographs. Classification errors for individual data points may be averaged over the entire scene. For example, if the number of "forest" data points misclassified as "other" is equal to the number of "other" data points misclassified as "forest," the total area of the forest may be correct even though numerous individual data points may be misclassified.
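The two checks described above, point-by-point accuracy in test areas and comparison of total area, can be sketched together. The function names and the eight-point test area are hypothetical; the example is built so that the misclassifications offset one another.

```python
from collections import Counter


def confusion_counts(reference, classified):
    """Count (reference class, assigned class) pairs over the test points."""
    if len(reference) != len(classified):
        raise ValueError("reference and classified lists must align")
    return Counter(zip(reference, classified))


def overall_accuracy(reference, classified):
    """Fraction of test points assigned to their reference category."""
    if not reference:
        raise ValueError("no test points supplied")
    counts = confusion_counts(reference, classified)
    correct = sum(n for (ref, cls), n in counts.items() if ref == cls)
    return correct / len(reference)


# One "forest" point is called "other" and one "other" point is called "forest".
reference = ["forest"] * 4 + ["other"] * 4
classified = ["forest", "forest", "forest", "other",
              "forest", "other", "other", "other"]

print(overall_accuracy(reference, classified))    # 0.75
print(Counter(classified) == Counter(reference))  # True
```

Here one quarter of the individual data points are wrong, yet the total number of points in each category, and therefore the estimated area, matches the reference exactly.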

Many researchers believe that the major potential advantage of digital processing is the quantitative nature of the available information. Preprocessing activities facilitate accurate formatting of the geometric and radiometric quality of the data to fit the user's specific requirements. Display and enhancement techniques can be used to emphasize features of particular interest. Analysis and classification can be accomplished in a consistently accurate and systematic manner. Statistical evaluation steps indicate the degree of classification accuracy within a scene and among different scenes. With these tools, the computer-assisted analyst can make scientifically valid, objective decisions concerning the Earth's resources.