GEOG5870/1M: Web-based GIS A course on web-based mapping

General: Uploading data

We have now seen how to draw down data from our Postgres database and display it on a map. However, what if we want to allow users to upload data, either because of a crowdsourcing project, or, for example, an in-the-field survey? This kind of thing is simple enough to write, but fraught with potential security issues. First we'll look at the basic mechanics, then we'll look at building something with the most security we can muster.

First let's look at the core elements of a system. Here's a zip file containing a webpage and php page; download it and unzip it somewhere but DON'T upload it to the server.

The first thing we need is some kind of interface for the user to enter data; our webpage provides that. Open it up in Notepad++ and a web browser and have a look at it. The first thing you should see is that it doesn't use any JavaScript. Instead, it uses an older client-server technology known as the Common Gateway Interface, or CGI. CGI is standard for transmitting data to servers and getting programs on the server to use the data. In this case we are doing this with the HTTP POST statement. This statement sends data to a server encoded invisibly (but not, with effort, unreadably) in the HTTP transmission (c.f. GET and PUT statements, that encode the information in URLs – we'll see an example later).

In our case, the POST statement will POST the server the data from an HTML FORM. Forms are made of sets of HTML tags that encode text boxes, buttons, etc. In most HTML-only cases, a FORM is sent by the user clicking an INPUT tag of type submit. These are usually displayed by the browser as buttons. This then enacts the FORM tag's method attribute to send the data to the FORM tag's action attribute – usually the web address of a server-side script. Again, all this happens using standard HTML/HTTP, there's no JavaScript involved, just as there's no JavaScript involved in a standard link issuing a GET request.

Note two additional things: 1) each FORM element that can be filled with data has a name (in the case below, 'user'):

<INPUT type='text' class='boxes' name='user'>

Secondly, that if you want to pass hidden information to a script, you can use a hidden input, thus:

<INPUT type='hidden' class='boxes' name='stuff' value='2342'>

This is useful if you want to pass data on through a chain of FORMs – you just get the script dealing with the first form to make the page containing the second form, with all the data from the first form embedded into it in hidden tags so they, in turn, gets passed to the second form's processing script (though if you don't also save the data, this only works if you're sure users are going to complete all the forms).


So, anyhow, how does CGI allow us to then access the data sent? The web server will write the data into an environment variable (called QUERY_STRING) that the script can access and interrogate. Open up the PHP file and have a look at it.

You should be able to see that the data sent to the script (in PHP) is converted by PHP from the QUERY_STRING (which we never see) and held in an associative array ($_POST) which you can look things up in by using a name, thus:

$person = $_POST['user'];

Unlike the script in the zip file, here we've used a different variable and name so you can see distinction. The name ('user') is the name given to the FORM element where the data was entered (or a hidden element). We just attach it to a variable (in this case $person) and we can use the data. Neat, huh?

Now let's look at the rest of the script. It should be fairly self-explanatory. We open a connection to the database using the username and password supplied by the FORM, construct a SQL query to find out the current last primary key, issue the query, and increment the variable now containing a copy of the primary key – we'll use this in a bit for the new row's primary key.

Next, we make ourselves an SQL statement using the SQL command INSERT INTO tableName (columnsToInsertInto) VALUES(valuesToInsert). We use this to insert our new values (note the treatment of text with single-quotes (why don't we use double quotes?)). Finally we check something suitably positive has come back and close the connection.

Upload this file to the server if you like and give it a go – it should work fine. Use the database username and password. Overall, this seems like a perfectly decent solution; our server is closed to off-campus access, even through Port 80, and we can track any nare-do-wells inside the Uni system. However, all is not well with this script, not well at all. To understand why, we need to understand a bit more about attacks on webservers and web security. It is worth taking a bit of time out to look through the web security resources we have provided and work through the 'Uploading data II' tasks if you have some time.


[ Next: Introduction to web security ]