GIS Developing Follow Me with PHP--Load GeoInfo from ESRI Shapefile format

1. Shapefile Description


Shapefiles are widely used and are supported by many GIS software packages. They can be created with the following four general methods:

1.  Export--Shapefiles can be created by exporting any data source to a shapefile using ARC/INFO, Spatial Database Engine (SDE), ArcView GIS, or BusinessMAP software.

2.  Digitize--Shapefiles can be created directly by digitizing shapes using ArcView GIS feature creation tools.

3.  Programming--Using Avenue (ArcView GIS), MapObjects, ARC Macro Language (AML) (ARC/INFO), or Simple Macro Language (SML) (PC ARC/INFO) software, you can create shapefiles within your plugin programs.

4.  Write directly--Shapefiles can be created by writing a program that follows the shapefile specifications. PHP can do that perfectly :P

A shapefile stores nontopological geometry and attribute information for the spatial features in a data set. The geometry for a feature is stored as a shape comprising a set of vector coordinates. Shapefiles can support point, line, and area features. Area features are represented as closed loop, double-digitized polygons.

An ESRI shapefile consists of a main file, an index file, and a dBASE table. In the index file, each record contains the offset of the corresponding main file record from the beginning of the main file. Attributes are held in a dBASE format file with the .dbf extension. Each attribute record has a one-to-one relationship with the associated shape record. I will write about how to read the DBF file later. The files could be named like this (all letters in a file name had better be in lower case):

Main file: counties.shp
Index file: counties.shx
dBASE table: counties.dbf
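
To make the role of the index file more concrete, here is a minimal sketch of reading the record offsets from a .shx file. It assumes the layout given in the ESRI specification: a 100-byte header followed by 8-byte records, each holding an offset and a content length as big endian integers measured in 16-bit words. The function name readShxIndex is made up for this example, and an in-memory stream stands in for a real counties.shx.

```php
<?php
// Hypothetical helper (not part of the original source code).
// Returns one array("offset" => bytes, "length" => bytes) entry per record.
function readShxIndex($handle)
{
    $index = array();
    fseek($handle, 100, SEEK_SET);               // skip the 100-byte file header
    while (strlen($raw = fread($handle, 8)) === 8) {
        $rec = unpack("Noffset/Nlength", $raw);  // both values are big endian
        $index[] = array(
            "offset" => $rec["offset"] * 2,      // 16-bit words -> bytes
            "length" => $rec["length"] * 2,
        );
    }
    return $index;
}

// Usage with an in-memory stream standing in for counties.shx
$shx = fopen("php://memory", "w+b");
fwrite($shx, str_repeat("\0", 100));             // dummy header
fwrite($shx, pack("NN", 50, 10));                // record 1: offset 50 words, 10 words long
fwrite($shx, pack("NN", 64, 10));                // record 2: offset 64 words, 10 words long
$index = readShxIndex($shx);
fclose($shx);

echo $index[1]["offset"], "\n";                  // 128
```

With such an index you can fseek() straight to any record in the main file instead of reading the records one after another.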

The main file (.shp) contains a fixed-length file header followed by variable-length records. Each variable-length record is made up of a fixed-length record header followed by variable-length record contents. The layout looks like this:

File Header
Record Header | Record Contents
Record Header | Record Contents
..................
Record Header | Record Contents

All the contents in a shapefile can be divided into two categories:
Data related
· Main file record contents
· Main file header’s data description fields (Shape Type, Bounding Box, etc.)
File management related
· File and record lengths
· Record offsets, and so on
The integers and double-precision floating point numbers that make up the data description fields in the file header (identified below) and the record contents in the main file are in little endian byte order. The integers and double-precision floating point numbers that make up the rest of the file and file management are in big endian byte order. A shapefile is a binary file, which is why you will only see strange characters if you open it with a text editor.
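
The difference between the two byte orders is easy to see with a small experiment. The sketch below reads the same four bytes once as a big endian integer ("N") and once as a little endian integer ("V"). The value 9994 is the file code that the ESRI specification stores at byte 0 of the header. It is used here only as sample data.

```php
<?php
// The same four bytes interpreted in both byte orders.
$bytes = "\x00\x00\x27\x0a";           // 9994 written in big endian order

$big    = unpack("N", $bytes)[1];      // big endian 32-bit integer
$little = unpack("V", $bytes)[1];      // little endian 32-bit integer

echo $big, "\n";                       // 9994
echo $little, "\n";                    // 170328064
```

Reading a field with the wrong format code does not raise an error. It just gives a wrong number, so it is worth checking every field against the specification.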

2. Load the Main File Header

The main file header is 100 bytes long. The header table in the ESRI Shapefile Technical Description (see Ref) shows the fields in the file header with their byte position, value, type, and byte order. In the table, position is with respect to the start of the file.


The value for file length is the total length of the file in 16-bit words (including the fifty 16-bit words that make up the header). Now let's open the shapefile. For the meaning of format codes such as 'N', 'V' and 'd', please refer to the pack() entry in the PHP manual.

$handle = fopen($file_name, "rb");
$headers = loadHeaders($handle);

function loadHeaders($handle)
{
    // File Length is at byte 24: read 4 bytes as a big endian integer
    fseek($handle, 24, SEEK_SET);
    $fileLength = loadData("N", fread($handle, 4));

    // Shape Type is at byte 32: read 4 bytes as a little endian integer
    fseek($handle, 32, SEEK_SET);
    $shapeType = loadData("V", fread($handle, 4));

    // The bounding box follows at byte 36: read 8 bytes four times.
    // "d" uses the byte order of the machine, which is little endian on x86.
    $boundingBox = array();
    $boundingBox["xmin"] = loadData("d", fread($handle, 8));
    $boundingBox["ymin"] = loadData("d", fread($handle, 8));
    $boundingBox["xmax"] = loadData("d", fread($handle, 8));
    $boundingBox["ymax"] = loadData("d", fread($handle, 8));

    return array(
        "fileLength" => $fileLength,
        "shapeType" => $shapeType,
        "boundingBox" => $boundingBox
    );
}

function loadData($type, $data)
{
    // Nothing could be read (end of file or read error)
    if (!$data) return $data;
    $tmp = unpack($type, $data);
    return current($tmp);
}

The values read from the file header will look like this:

fileLength: 25962
shapeType: 5
boundingBox:
(
    [xmin] => -117.122375488
    [ymin] => 14.5505466461
    [xmax] => -86.7350006104
    [ymax] => 32.7208099365
)
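
As an alternative, the whole header can be decoded with a single unpack() call. The sketch below assumes the field positions from the ESRI specification: file code at byte 0 (big endian), file length at byte 24 (big endian), version at byte 28, shape type at byte 32, and the bounding box from byte 36 (all little endian). The function name parseShpHeader is hypothetical, and the format code "e" (little endian double) needs PHP 7.0.15 or later.

```php
<?php
// Hypothetical helper: decode the first 100 bytes of a .shp file.
function parseShpHeader($raw)
{
    if (strlen($raw) < 100) {
        return false;                            // the header is incomplete
    }
    // N = 32-bit big endian, V = 32-bit little endian,
    // e = little endian double, x20 = skip the 20 unused bytes
    $h = unpack(
        "NfileCode/x20/NfileLength/Vversion/VshapeType/exmin/eymin/exmax/eymax",
        $raw
    );
    $h["fileBytes"] = $h["fileLength"] * 2;      // 16-bit words -> bytes
    return $h;
}

// Usage with a synthetic header built by pack()
$raw = pack("N", 9994) . str_repeat("\0", 20) . pack("N", 25962)
     . pack("V", 1000) . pack("V", 5)
     . pack("e4", -117.5, 14.5, -86.75, 32.75)
     . str_repeat("\0", 32);                     // Z and M ranges, not used here
$h = parseShpHeader($raw);

echo $h["fileBytes"], "\n";                      // 51924
```

For a real file, fileBytes should equal the result of filesize() on the .shp file, which makes a cheap sanity check before reading any records.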


All the non-Null shapes in a shapefile are required to be of the same shape type. The values for shape type are as follows:

Value Shape Type
0 Null Shape
1 Point
3 PolyLine
5 Polygon
8 MultiPoint
11 PointZ
13 PolyLineZ
15 PolygonZ
18 MultiPointZ
21 PointM
23 PolyLineM
25 PolygonM
28 MultiPointM
31 MultiPatch


Currently, a shapefile is restricted to containing a single shape type from the list above (plus null shapes). In this article, only shape types 0, 1, 3, 5 and 8 are read.
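
When debugging, it helps to turn the numeric shape type into a readable name. The following is a small sketch built from the list above. The function names shapeTypeName and isSupportedShapeType are invented for this example.

```php
<?php
// Hypothetical lookup: shape type value -> name.
function shapeTypeName($code)
{
    static $names = array(
        0 => "Null Shape", 1 => "Point", 3 => "PolyLine", 5 => "Polygon",
        8 => "MultiPoint", 11 => "PointZ", 13 => "PolyLineZ", 15 => "PolygonZ",
        18 => "MultiPointZ", 21 => "PointM", 23 => "PolyLineM", 25 => "PolygonM",
        28 => "MultiPointM", 31 => "MultiPatch",
    );
    return isset($names[$code]) ? $names[$code] : "Unknown";
}

// True for the shape types that this article reads.
function isSupportedShapeType($code)
{
    return in_array($code, array(0, 1, 3, 5, 8), true);
}

echo shapeTypeName(5), "\n";                     // Polygon
var_dump(isSupportedShapeType(15));              // bool(false)
```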

3. Record Headers

The header for each record stores the record number and content length for the record. Record headers have a fixed length of 8 bytes. The record header table in the ESRI specification shows the fields in the record header with their byte position, value, type, and byte order. In the table, position is with respect to the start of the record. The content length is measured in 16-bit words, like the file length.



The following function reads the header of each record:
    function loadStoreHeaders()
    {
        $this->recordNumber = loadData("N", fread($this->SHPFile, 4));
        // Content length of the record in 16-bit words (read to move the file pointer, not used here)
        $tmp = loadData("N", fread($this->SHPFile, 4));
        // The shape type is the first field of the record contents and is little endian
        $this->shapeType = loadData("V", fread($this->SHPFile, 4));
    }


Now we know the shape type of each record and can use a different function to read each kind of data.

    function loadFromFile($handle)
    {
        $this->SHPFile = $handle;
        $this->loadStoreHeaders();

        switch ($this->shapeType) {
            case 0:
                $this->loadNullRecord();
                break;
            case 1:
                $this->loadPointRecord();
                break;
            case 3:
                $this->loadPolyLineRecord();
                break;
            case 5:
                $this->loadPolygonRecord();
                break;
            case 8:
                $this->loadMultiPointRecord();
                break;
            default:
                break;
        }
    }
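
The content length in the record header also makes it possible to skip a record without decoding it, which is handy for the shape types we do not read. Here is a minimal sketch of walking through all records that way. The function name listRecords is hypothetical, the sample data is a tiny in-memory file, and the "e" format code used to build it needs PHP 7.0.15 or later.

```php
<?php
// Hypothetical helper: list the number, shape type and byte offset of every record.
function listRecords($handle)
{
    $records = array();
    fseek($handle, 100, SEEK_SET);                   // records start after the file header
    while (strlen($head = fread($handle, 8)) === 8) {
        $h = unpack("Nnumber/Nwords", $head);        // the record header is big endian
        $contentStart = ftell($handle);
        $type = unpack("V", fread($handle, 4))[1];   // the shape type is little endian
        $records[] = array(
            "number" => $h["number"],
            "shapeType" => $type,
            "offset" => $contentStart - 8,
        );
        // Jump over the rest of the contents to the next record header
        fseek($handle, $contentStart + $h["words"] * 2, SEEK_SET);
    }
    return $records;
}

// Usage: one point record and one null shape record
$shp = fopen("php://memory", "w+b");
fwrite($shp, str_repeat("\0", 100));                                    // dummy file header
fwrite($shp, pack("NN", 1, 10) . pack("V", 1) . pack("e2", 1.5, 2.5));  // 20 bytes = 10 words
fwrite($shp, pack("NN", 2, 2) . pack("V", 0));                          // 4 bytes = 2 words
$records = listRecords($shp);
fclose($shp);

print_r($records);
```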

4. Read Main File Record Contents

Shapefile record contents consist of a shape type followed by the geometric data for the shape. The length of the record contents depends on the number of parts and vertices in a shape. For each shape type, we first describe the shape and then its mapping to record contents on disk.

4.1 Null Shapes

A shape type of 0 indicates a null shape, with no geometric data for the shape. Each feature type (point, line, polygon, etc.) supports nulls, so it is valid to have points and null points in the same shapefile.



Since a null shape has no geodata, an empty array is stored.

    function loadNullRecord()
    {
        $this->SHPData = array();
    }


4.2 Point

A point consists of a pair of double-precision coordinates in the order X,Y.

Point
{
Double X // X coordinate
Double Y // Y coordinate
}



Because a point coordinate is read in the same way for every shape type, we can define one function to read it.

    function loadPoint()
    {
        $data = array();
        $x1 = loadData("d", fread($this->SHPFile, 8));
        $y1 = loadData("d", fread($this->SHPFile, 8));
        // keep a trailing space after y1 so that more coordinates can be appended later
        $data["pointString"] = "$x1 $y1 ";
        return $data;
    }

Next we read the point contents. A Point record has no bounding box or number-of-points fields, but we can add that information to the array manually.
    function loadPointRecord()
    {
        $this->SHPData = array();
        $data = $this->loadPoint();
        $tmp = explode(" ", $data["pointString"]);
        // A single point is its own bounding box
        $this->SHPData["xmin"] = $this->SHPData["xmax"] = $tmp[0];
        $this->SHPData["ymin"] = $this->SHPData["ymax"] = $tmp[1];
        $this->SHPData["numparts"] = 1;
        $this->SHPData["numpoints"] = 1;
        $this->SHPData["parts"][0]["pointString"] = $data["pointString"];
    }
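
The pointString format is compact, but for calculations you usually want numbers again. The sketch below shows one way to turn a string like "x1 y1 x2 y2 " back into coordinate pairs. The function name pointStringToPairs is made up for this example. Note that building the string converts each double to text, so some digits may be lost depending on PHP's precision setting.

```php
<?php
// Hypothetical helper: "x1 y1 x2 y2 " -> array(array(x1, y1), array(x2, y2)).
function pointStringToPairs($pointString)
{
    $values = preg_split('/\s+/', trim($pointString), -1, PREG_SPLIT_NO_EMPTY);
    $pairs = array();
    foreach (array_chunk($values, 2) as $pair) {
        if (count($pair) === 2) {                    // ignore a dangling single value
            $pairs[] = array((float) $pair[0], (float) $pair[1]);
        }
    }
    return $pairs;
}

$pairs = pointStringToPairs("-101.5 21.75 -101.25 21.5 ");
echo count($pairs), "\n";                            // 2
```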

4.3 MultiPoint

A MultiPoint represents a set of points, as follows:
MultiPoint
{
Double[4] Box // Bounding Box
Integer NumPoints // Number of Points
Point[NumPoints] Points // The Points in the Set
}

The Bounding Box is stored in the order Xmin, Ymin, Xmax, Ymax, and each value is an 8-byte double.

The function is as follows:
    function loadMultiPointRecord()
    {
        $this->SHPData = array();
        $this->SHPData["xmin"] = loadData("d", fread($this->SHPFile, 8));
        $this->SHPData["ymin"] = loadData("d", fread($this->SHPFile, 8));
        $this->SHPData["xmax"] = loadData("d", fread($this->SHPFile, 8));
        $this->SHPData["ymax"] = loadData("d", fread($this->SHPFile, 8));

        $this->SHPData["numpoints"] = loadData("V", fread($this->SHPFile, 4));

        // Store all points as one part, in the same layout as a Point record
        $this->SHPData["numparts"] = 1;
        $this->SHPData["parts"][0]["pointString"] = "";
        // Use < and not <=, otherwise one point too many is read
        for ($i = 0; $i < $this->SHPData["numpoints"]; $i++) {
            $data = $this->loadPoint();
            $this->SHPData["parts"][0]["pointString"] .= $data["pointString"];
        }
    }
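
The MultiPoint structure also tells us how long the record contents must be, which can be compared with the content length from the record header. Here is a small sketch of that arithmetic. It assumes that the 4-byte shape type at the start of the record contents is counted as well, and the function name multiPointContentBytes is hypothetical.

```php
<?php
// Hypothetical helper: expected size in bytes of a MultiPoint record's contents.
function multiPointContentBytes($numPoints)
{
    return 4                    // shape type
         + 4 * 8                // bounding box: four doubles
         + 4                    // NumPoints
         + $numPoints * 16;     // each point is two doubles
}

echo multiPointContentBytes(3), "\n";        // 88
echo multiPointContentBytes(3) / 2, "\n";    // 44, the same length in 16-bit words
```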

4.4 PolyLine

A PolyLine is an ordered set of vertices that consists of one or more parts. A part is a connected sequence of two or more points. Parts may or may not be connected to one another. Parts may or may not intersect one another. Because this specification does not forbid consecutive points with identical coordinates, shapefile readers must handle such cases. On the other hand, the degenerate, zero length parts that might result are not allowed.
PolyLine
{
Double[4] Box // Bounding Box
Integer NumParts // Number of Parts
Integer NumPoints // Total Number of Points
Integer[NumParts] Parts // Index to First Point in Part
Point[NumPoints] Points // Points for All Parts
}

NumParts The number of parts in the PolyLine.
NumPoints The total number of points for all parts.
Parts An array of length NumParts. Stores, for each part, the index of its first point in the points array. Array indexes are with respect to 0.

The following function is a little bit complex, so take it easy.
    function loadPolyLineRecord()
    {
        $this->SHPData = array();
        $this->SHPData["xmin"] = loadData("d", fread($this->SHPFile, 8));
        $this->SHPData["ymin"] = loadData("d", fread($this->SHPFile, 8));
        $this->SHPData["xmax"] = loadData("d", fread($this->SHPFile, 8));
        $this->SHPData["ymax"] = loadData("d", fread($this->SHPFile, 8));

        $this->SHPData["numparts"] = loadData("V", fread($this->SHPFile, 4));
        $this->SHPData["numpoints"] = loadData("V", fread($this->SHPFile, 4));

        // Read the index of the first point of each part
        $partStarts = array();
        for ($i = 0; $i < $this->SHPData["numparts"]; $i++) {
            $partStarts[$i] = loadData("V", fread($this->SHPFile, 4));
        }

        $firstIndex = ftell($this->SHPFile);
        $readPoints = 0;
        $this->SHPData["parts"] = array();
        // foreach replaces each(), which no longer exists in PHP 8
        foreach ($partStarts as $partIndex => $partStart) {
            // A part ends where the next part starts, or at the last point
            $partEnd = isset($partStarts[$partIndex + 1]) ? $partStarts[$partIndex + 1] : $this->SHPData["numpoints"];
            $this->SHPData["parts"][$partIndex] = array("pointString" => "");
            while ($readPoints < $partEnd && !feof($this->SHPFile)) {
                $data = $this->loadPoint();
                $this->SHPData["parts"][$partIndex]["pointString"] .= $data["pointString"];
                $readPoints++;
            }
        }

        // Each point takes 16 bytes (two doubles)
        fseek($this->SHPFile, $firstIndex + ($readPoints * 16));
    }
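
If the role of the Parts array is still unclear, it may help to look at it without any file reading. The sketch below turns the Parts array into start and end indexes for each part. The function name partRanges is invented for this example, and the end index is exclusive.

```php
<?php
// Hypothetical helper: convert the Parts array into start/end ranges.
// "end" is the index just after the last point of the part.
function partRanges(array $partStarts, $numPoints)
{
    $ranges = array();
    $count = count($partStarts);
    for ($i = 0; $i < $count; $i++) {
        // A part ends where the next part starts, or at the last point
        $end = ($i + 1 < $count) ? $partStarts[$i + 1] : $numPoints;
        $ranges[] = array("start" => $partStarts[$i], "end" => $end);
    }
    return $ranges;
}

// Three parts in a PolyLine with 12 points:
// part 0 = points 0..4, part 1 = points 5..8, part 2 = points 9..11
$ranges = partRanges(array(0, 5, 9), 12);
echo $ranges[1]["start"], "-", $ranges[1]["end"] - 1, "\n";   // 5-8
```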

4.5 Polygon

A polygon consists of one or more rings. A ring is a connected sequence of four or more points that form a closed, non-self-intersecting loop. A polygon may contain multiple outer rings. The order of vertices or orientation for a ring indicates which side of the ring is the interior of the polygon. The neighborhood to the right of an observer walking along the ring in vertex order is the neighborhood inside the polygon. Vertices of rings defining holes in polygons are in a counterclockwise direction. Vertices for a single, ringed polygon are, therefore, always in clockwise order. The rings of a polygon are referred to as its parts.
Because this specification does not forbid consecutive points with identical coordinates, shapefile readers must handle such cases. On the other hand, the degenerate, zero length or zero area parts that might result are not allowed.
The Polygon structure is identical to the PolyLine structure, as follows:
Polygon
{
Double[4] Box // Bounding Box
Integer NumParts // Number of Parts
Integer NumPoints // Total Number of Points
Integer[NumParts] Parts // Index to First Point in Part
Point[NumPoints] Points // Points for All Parts
}



Uhmmmm, the description of the polygon may seem confusing, but you can forget it for now, because a polygon is read in exactly the same way as a polyline.
    function loadPolygonRecord()
    {
        $this->loadPolyLineRecord();
    }

Don't reinvent the wheel!
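
The ring orientation rule only matters once you want to tell outer rings from holes, for example when filling polygons. One common way to test the orientation is the signed area (shoelace formula). This is a sketch of that general technique and not something taken from the original source code. It assumes x grows to the right and y grows upward, and the function names are hypothetical.

```php
<?php
// Hypothetical helper: signed area of a ring given as array(array(x, y), ...).
// Negative = clockwise (outer ring), positive = counterclockwise (hole).
function ringSignedArea(array $ring)
{
    $sum = 0.0;
    $n = count($ring);
    for ($i = 0; $i < $n; $i++) {
        $j = ($i + 1) % $n;                          // next vertex, wrapping around
        $sum += $ring[$i][0] * $ring[$j][1] - $ring[$j][0] * $ring[$i][1];
    }
    return $sum / 2;
}

function isHole(array $ring)
{
    return ringSignedArea($ring) > 0;
}

$outer = array(array(0, 0), array(0, 4), array(4, 4), array(4, 0), array(0, 0)); // clockwise
$hole  = array(array(1, 1), array(2, 1), array(2, 2), array(1, 2), array(1, 1)); // counterclockwise

var_dump(isHole($outer));                            // bool(false)
var_dump(isHole($hole));                             // bool(true)
```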

5. Load Data and Visualization

If you are not clear about how to convert the geodata to screen coordinates and display it, please refer to my other article: GIS Developing Follow Me with PHP--Visualize Geodata and create map in raster image

$arrGeometry = loadShapeFile("mexico.shp"); // other test data: capitals.shp, buildings
// print_r($arrGeometry);

$im = imagecreatefromjpeg("earth_620.jpg");
$width = imagesx($im);
$height = imagesy($im);

$land = imagecolorallocate($im, 0xF7, 0xEF, 0xDE);
$sea = imagecolorallocate($im, 0xB5, 0xC7, 0xD6);
$red = imagecolorallocate($im, 0xff, 0x00, 0x00);
// imagefilledrectangle($im, 0, 0, $width, $height, $sea);

foreach ($arrGeometry as $poly) {
    // Null shapes have no parts, so nothing is drawn for them
    $numparts = isset($poly["geom"]["numparts"]) ? $poly["geom"]["numparts"] : 0;
    for ($j = 0; $j < $numparts; $j++) {
        // make "x y " to "x y", then split it into a flat list of coordinates
        $points = explode(" ", trim($poly["geom"]["parts"][$j]["pointString"]));
        $number_points = count($points);

        // convert every lon/lat pair of this part to screen coordinates
        $converted_points = array();
        for ($i = 0; $i + 1 < $number_points; $i += 2) {
            $lon = $points[$i];
            $lat = $points[$i + 1];
            $pt = getlocationcoords($lat, $lon, $width, $height);
            $converted_points[] = $pt["x"];
            $converted_points[] = $pt["y"];
        }
        $number_coords = count($converted_points);

        switch ($poly["shapetype"]) {
            case 1: // point
            case 8: // multipoint: one dot per point
                for ($k = 0; $k + 1 < $number_coords; $k += 2) {
                    imagefilledellipse($im, $converted_points[$k], $converted_points[$k + 1], 4, 4, $red);
                }
                break;
            case 3: // polyline: one segment between each pair of consecutive points
                for ($k = 0; $k + 3 < $number_coords; $k += 2) {
                    imageline($im, $converted_points[$k], $converted_points[$k + 1], $converted_points[$k + 2], $converted_points[$k + 3], $red);
                }
                break;
            case 5: // polygon: each part is one ring and needs at least 3 points
                if ($number_coords >= 6) {
                    imagepolygon($im, $converted_points, $number_coords / 2, $red);
                }
                break;
        }
    }
}

header("Content-type: image/png");
imagepng($im);
imagedestroy($im);
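
The function getlocationcoords() comes from the other article and is not shown here. As a rough idea of what such a conversion can look like, here is a sketch under the assumption that the background image is an equirectangular world map covering longitude -180 to 180 and latitude 90 to -90. The function name lonLatToPixel is hypothetical and the real getlocationcoords() may work differently.

```php
<?php
// Hypothetical stand-in for getlocationcoords(), assuming an equirectangular
// world map: x is proportional to longitude, y to latitude.
function lonLatToPixel($lat, $lon, $width, $height)
{
    $x = ($lon + 180) * ($width / 360);
    $y = (90 - $lat) * ($height / 180);      // the image y axis points down
    return array("x" => (int) round($x), "y" => (int) round($y));
}

// The point at longitude 0, latitude 0 lands in the middle of a 620 x 310 image
$pt = lonLatToPixel(0, 0, 620, 310);
echo $pt["x"], ",", $pt["y"], "\n";          // 310,155
```

The argument order (latitude first, then longitude) follows the call to getlocationcoords() in the code above.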


The output geodata should have a structure like this:

            [shapetype] => 5
            [geom] => Array
                (
                    [xmin] => -105.676963806
                    [ymin] => 18.9549999237
                    [xmax] => -101.524902344
                    [ymax] => 22.7647209167
                    [numparts] => 1
                    [numpoints] => 156
                    [parts] => Array
                        (
                            [0] => Array
                                (
                                    [pointString] => -101.524902344 21.8566398621 -101.588302612 21.7727794647 -101.54360199 21.6569404602 ....................................................................................524902344 21.8566398621
                                )

                        )

                )

And the output map should look like this if you use the Mexico border line:


This is geodata from capitals.shp:


The source code can be downloaded from here: visulizeGeodata.rar

Ref: ESRI Shapefile Technical Description
http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf
bfShapeFiles-0.0.1 from ovidio AT users.sourceforge.net