Why does Perl require the "print Content-type" line?
If you write CGI programs, you are probably using Perl.
I say so because Perl has very powerful string manipulation
capabilities that make it a popular choice among scripting
languages. Perl is easy to learn too!
Having said that, when you wrote your first "Hello World"
program in Perl, you must have encountered the unfriendly
error page -- Internal Server Error. But after you
added the following line to your Perl program:
print "Content-type: text/html\n\n";
it then worked like magic!
In this article, we are going to discuss why Perl
requires the above line for your CGI programs to work while
other languages don't. If you are an experienced web developer,
you may already know the answer. In that case, I suggest you
check out our other interesting articles. See, I don't want
to waste your precious time here :)
Alrighty. Before we indulge our vanity with the pearly Perl,
let's get down to earth and learn the basics. When we type
a URL (e.g. http://dmoz.org/) into our browser, it magically
displays the webpage, right? Did you ever think about what
happens under the covers? Let's uncover the mystery with
HttpRevealer.
Say, you type http://dmoz.org/ into your browser...
As soon as you hit "Enter", your browser connects to the
server (in this case, dmoz.org) and asks for
the home page of the site. Upon receiving your browser's request,
the web server responds by returning the home page of the
site. Simple, huh?
Did that give you the impression that the browser and the
web server talk to each other? Yes, they do. They talk just
like we do. But they don't talk in English; they communicate
in a language of their own, the HTTP protocol.
Now that you have a basic understanding that your browser
has to communicate with the web server in order to get the
webpages you want to see, let's get technical and see what
actually happened:
The above is the screenshot of HttpRevealer. The upper pane
shows your browser's request while the lower pane shows the
web server's response.
Don't be frightened by the unfamiliar language. We will
be focusing on the simple (yet important) stuff only. If you
look at the very first line of the HTTP request in the upper
pane, it says:
GET / HTTP/1.0
In English, it's simply saying, "Hey, Mr. Web Server, I am
here to get the document located at your root www directory.
I speak the HTTP protocol version one point zero."
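To make the three parts of that request line concrete, here is a minimal Perl sketch that pulls them apart. The helper name parse_request_line and the sample line "GET / HTTP/1.0" are assumptions made for illustration; a real web server checks far more than this.

```perl
use strict;
use warnings;

# Split an HTTP request line into method, path and protocol version.
# parse_request_line is a hypothetical helper, not part of any library.
sub parse_request_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                        # drop the trailing line ending
    my ($method, $path, $version) =
        $line =~ m{\A(\S+) (\S+) HTTP/(\d+\.\d+)\z}
        or return;                               # not a well-formed request line
    return { method => $method, path => $path, version => $version };
}

# Assumed sample request line, matching the description in the text.
my $req = parse_request_line("GET / HTTP/1.0\r\n");
print "$req->{method} $req->{path} (HTTP $req->{version})\n";
```

The method ("GET") says what the browser wants to do, the path ("/") says which document it wants, and the last part says which version of HTTP it speaks.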
Now, see how the web server responds:
HTTP/1.1 200 OK
.........
Content-Type: text/html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
......
The first line says, "Okey-dokey. I've got what you're looking
for. I speak the HTTP protocol version one point one."
The third line says, "What I am gonna give you is an HTML
document. Get prepared!" The next blank line marks the
end of the HTTP headers. The fifth line is the beginning of
the HTML document.
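The role of that blank line is easy to see in code. Below is a small Perl sketch, under the assumption that the whole response is already in one string, that cuts a response into status line, headers and body. The name split_response is hypothetical, and the sample response is a shortened stand-in for the real one.

```perl
use strict;
use warnings;

# Split a raw HTTP response into status line, headers and body.
# split_response is a hypothetical helper used only for this sketch.
sub split_response {
    my ($raw) = @_;
    # The headers end at the first blank line.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my ($status, @lines) = split /\r?\n/, $head;
    my %headers;
    for my $line (@lines) {
        my ($name, $value) = split /:\s*/, $line, 2;
        $headers{ lc $name } = $value;           # header names are case-insensitive
    }
    return ($status, \%headers, defined $body ? $body : '');
}

# Shortened sample response (an assumption, not a captured one).
my $raw = "HTTP/1.1 200 OK\r\n"
        . "Content-Type: text/html\r\n"
        . "\r\n"
        . "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n";

my ($status, $headers, $body) = split_response($raw);
print "status: $status\n";
print "type:   $headers->{'content-type'}\n";
```

Everything before the blank line is for the browser's eyes only; everything after it is the document you actually see.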
Wasn't that amazingly simple? But don't forget what we're here
for -- we want to know why Perl requires the "print Content-type"
line for your CGI program to work properly.
Say, you have a simplistic CGI program like this:
#!/usr/pkg/bin/perl
print "<html><body>Hello World</body></html>\n";
As you can imagine, when you invoke the above CGI program from
your browser, the browser asks the webserver for the
file (e.g. bad.pl) in pretty much the same way as it did in
the above example. However, the webserver has to do more
this time. It has to locate the file, execute it (since it's
a CGI script rather than a simple HTML file), and then return
whatever the Perl program outputs (in this case, just a
single line). Here is the problem: a CGI script is expected to
print its HTTP headers first, then a blank line, then the content.
The above Perl script prints no header at all, so it doesn't
say what type of content its output is, and the webserver doesn't
know what Content-type should be used with the response. So,
the webserver decides to err out and says 500 Internal Server
Error.
To fix the problem, simply change your Perl script to:
#!/usr/pkg/bin/perl
print "Content-type: text/html\n\n";
print "<html><body>Hello World</body></html>\n";
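The new line tells the webserver that the script's output is
of the type "text/html". The first "\n" ends the header line, and
the second one produces the blank line that marks the end of the
headers. So, the webserver knows what value
the Content-type header should carry this time and thus happily
performs its duty.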

Under the covers, this is what happens ...
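As a rough model of what the webserver does with a script's output, consider the Perl sketch below. It is only an illustration under simplifying assumptions: build_response is a hypothetical name, and a real server parses the headers far more carefully and adds headers of its own.

```perl
use strict;
use warnings;

# Turn the raw output of a CGI script into an HTTP response.
# build_response is a hypothetical name; real servers do much more than this.
sub build_response {
    my ($script_output) = @_;
    # The output must contain a header block, a blank line, then the body.
    my ($head, $body) = split /\r?\n\r?\n/, $script_output, 2;
    # Accept the output only if the header block has a Content-type line.
    my $ok = defined $body && $head =~ /^Content-type:\s*\S+/mi;
    return "HTTP/1.1 500 Internal Server Error\r\n\r\n" unless $ok;
    $head =~ s/\r?\n/\r\n/g;                     # normalise header line endings
    return "HTTP/1.1 200 OK\r\n$head\r\n\r\n$body";
}

my $bad  = qq{<html><body>Hello World</body></html>\n};
my $good = qq{Content-type: text/html\n\n} . $bad;

# Print only the status line of each response.
for my $out ($bad, $good) {
    my ($status_line) = split /\r\n/, build_response($out);
    print "$status_line\n";
}
```

The first script output has no header block, so the model answers with a 500 status line; the second one carries the Content-type line and gets 200 OK.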

We still have another question unanswered: Why don't other
server-side languages (e.g. ASP, PHP, etc.) require this "Content-type"
line? The answer becomes quite obvious now. If you don't specify
the Content-type, say, in ASP, the webserver (i.e. IIS 4 or
IIS 5) boldly assumes that the output of the ASP page is HTML and
always includes this:
Content-Type: text/html
in its HTTP response headers unless instructed otherwise.
You don't believe me? Good learning attitude! Say, you have
an ASP page like this:
<%
Response.Write "<html><body> Hello World! </body></html>"
%>
You invoke the ASP page from your browser:
See how the webserver (IIS 5) responds with HttpRevealer:
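The "assume HTML unless told otherwise" behaviour can be modelled in a few lines of Perl. This is only a sketch of the idea, not how IIS is implemented; with_default_type is a hypothetical helper, and it assumes the page's output arrives as one string.

```perl
use strict;
use warnings;

# Prepend a default Content-type when the page's own output has none.
# with_default_type is a hypothetical helper for illustration only.
sub with_default_type {
    my ($output, $default) = @_;
    $default = 'text/html' unless defined $default;
    return $output if $output =~ /\AContent-type:/i;   # the author chose a type
    return "Content-type: $default\n\n$output";        # fall back to the default
}

print with_default_type("<html><body> Hello World! </body></html>\n");
```

A plain Perl CGI script gets no such safety net, which is why you have to print the header yourself.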
That's that. I know it's a bit too long for such simple
stuff, but I hope you enjoyed the discussion. I found out the
above with HttpRevealer. You can
explore the web yourself too!
Steven Chau