Recently, I carried out a Linux server upgrade for a client, doing a clean install of the latest release of the distribution. Since it was a clean install, I had to back up and note down the earlier configuration (mail server, web server, database) and redo those changes. For the most part I preferred not to simply overwrite the new files with the backed-up configuration files. Instead, I documented the old settings and edited the configuration manually.
It all seemed to have gone smoothly, and the new server was up and running. But one not-so-fine day, the client started complaining that some HTML pages were not displaying properly. They were showing question marks (?) and other weird characters. I figured out that these HTML pages had been generated using Microsoft Word and contained special characters (closing quotes, double hyphens, etc.). I suggested to the client that this could be a web browser problem, with the browser failing to use the correct character set. But the client insisted that these pages used to display properly before the upgrade. That meant I had missed redoing some piece of configuration. The obvious suspect was the Apache web server. After all, it is Apache that serves these web pages to the browser, so if the culprit was not the web browser, it had to be the web server. After going through Apache's configuration file, I spotted a comment above a directive (configuration option) called AddDefaultCharset, which said:
# Specify a default charset for all content served; this enables
# interpretation of all content as UTF-8 by default. To use the
# default browser choice (ISO-8859-1), or to allow the META tags
# in HTML content to override this choice, comment out this
# directive:
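I followed the advice and commented out the AddDefaultCharset directive, and voila, it worked!
To sum up, if HTML files served by an Apache web server are not displaying special characters properly in a web browser (IE, Firefox, etc.), the solution is to comment out the AddDefaultCharset directive in Apache's configuration file and restart Apache so that the change takes effect.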
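To check whether a server is still announcing a charset of its own, it helps to look at the Content-Type response header. Here is a minimal Python sketch, assuming you already have the header value as a string. The function name declared_charset is hypothetical, and I am using the standard library's email.message.Message only as a convenient header parser.

```python
from email.message import Message


def declared_charset(content_type_header):
    """Return the charset named in a Content-Type header value, or None.

    Hypothetical helper for illustration: it parses only the header string
    and does not contact any server.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_charset() returns the charset parameter in lower case,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset()


# With a default charset configured, the header names a charset.
with_default = declared_charset("text/html; charset=UTF-8")

# With no charset configured, the header carries only the media type.
without_default = declared_charset("text/html")
```

If the result is None, the server is not imposing a charset and the browser is free to use the one named in the page. The header value itself can be read, for example, from urllib.request.urlopen(url).headers["Content-Type"].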
@shekharg
It’s a nice experience!
But can you explain why Microsoft Word's special characters cause this kind of problem with the Apache server?
The web browser uses a META tag, written in the web page, to choose the character set (http://en.wikipedia.org/wiki/Character_encoding) used to display the content of the page.
For example, an HTML file (web page) generated by MS Word contains a META tag like the following:
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
With this tag the web page instructs the web browser to use windows-1252 to display the page. With this character set, the browser can display all the so-called special characters properly.
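Apache (at least in the default configuration) tells the browser, through the HTTP Content-Type header, to use UTF-8 (http://en.wikipedia.org/wiki/UTF-8) as the character set. The browser gives the web server's header precedence over the META tag in the web page, so the windows-1252 bytes get interpreted as UTF-8 and the special characters come out wrong.
With the directive commented out, Apache no longer announces a character set of its own, so the web browser uses whatever character set is mentioned in the META tag of the page.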
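The mismatch can be reproduced outside the browser. The following Python sketch is only an illustration with a made-up sample string: it encodes a few of Word's typical characters as windows-1252 and then decodes the same bytes both ways.

```python
# Sample text with characters Word likes to insert:
# curly double quotes (U+201C, U+201D) and an en dash (U+2013).
text = "\u201cquoted\u201d \u2013 dash"

# The bytes as they would be stored in a windows-1252 encoded file.
raw = text.encode("windows-1252")

# Decoding with the charset named in the META tag restores the text.
as_cp1252 = raw.decode("windows-1252")

# Decoding the same bytes as UTF-8 fails for each special character:
# the bytes 0x93, 0x94 and 0x96 are not valid UTF-8 on their own, so each
# one is replaced by U+FFFD, the "replacement character".
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_cp1252)
print(as_utf8)
```

The plain ASCII letters survive either way, which is why only the quotes and dashes on those pages looked broken.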
Very helpful.
You saved me many hours.
Thanks.
Yep it also took me lots of “time, energy, brain” = googling