Tip: #1074 - Detect encoding from the charset specified in HTML files
Created: December 9, 2005 22:41 Complexity: advanced Author: Wu Yongwei Version: 6.0 Karma: 3/3 Imported from: Tip#1074
If you need to edit files in several different legacy encodings, Vim's 'fileencodings' option cannot help much: an 8-bit encoding is always accepted without error, so Vim cannot tell one legacy encoding from another. Some hacks can be used to put the file encoding in the file itself (see Tip #911). However, in the case of HTML files, the encoding information is often in the file already, especially for non-Latin1 Web pages, for example:
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
The following code can be put in _vimrc to detect and use such encoding specification:
code begins -----
if has('autocmd')
  " Map a charset name found in HTML to the encoding name Vim should use
  function! ConvertHtmlEncoding(encoding)
    if a:encoding ==? 'gb2312'
      return 'cp936'            " GB2312 imprecisely means CP936 in HTML
    elseif a:encoding ==? 'iso-8859-1'
      return 'latin1'           " The canonical encoding name in Vim
    elseif a:encoding ==? 'utf8'
      return 'utf-8'            " Other encoding aliases should follow here
    else
      return a:encoding
    endif
  endfunction

  function! DetectHtmlEncoding()
    if &filetype != 'html'
      return
    endif
    " Remember the cursor position, then search from the top of the file
    normal m`
    normal gg
    if search('\c<meta http-equiv=\("\?\)Content-Type\1 content="text/html; charset=[-A-Za-z0-9_]\+">') != 0
      " Yank the matched line via the unnamed register, restoring it afterwards
      let reg_bak=@"
      normal y$
      let charset=matchstr(@", 'text/html; charset=\zs[-A-Za-z0-9_]\+')
      let charset=ConvertHtmlEncoding(charset)
      normal ``
      let @"=reg_bak
      if &fileencodings == ''
        let auto_encodings=',' . &encoding . ','
      else
        let auto_encodings=',' . &fileencodings . ','
      endif
      " Reload only if the current encoding was auto-detected and differs
      if charset !=? &fileencoding &&
            \ auto_encodings =~ ',' . &fileencoding . ','
        silent! exec 'e ++enc=' . charset
      endif
    else
      normal ``
    endif
  endfunction

  " Detect charset encoding in an HTML file
  au BufReadPost *.htm* nested call DetectHtmlEncoding()
endif
code ends -----
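The detection step can also be done without moving the cursor or touching a register. The following is only a minimal sketch and not part of the original tip: the function name GuessHtmlCharset and the 50-line limit are assumptions made for illustration, it needs Vim 7 or later (for lists and :for), and it only finds a meta tag that fits on a single line. Unlike the search pattern above, its looser pattern also accepts the short form <meta charset="...">.
code begins -----
" Hypothetical helper: return the charset named in a <meta> tag found
" within the first 50 lines of the current buffer, or '' if there is none.
function! GuessHtmlCharset()
  " \c ignores case; \zs makes the match start at the charset name itself
  let pat = '\c<meta[^>]*charset=["'']\?\zs[-A-Za-z0-9_]\+'
  for line in getline(1, 50)
    let charset = matchstr(line, pat)
    if charset != ''
      return charset
    endif
  endfor
  return ''
endfunction
code ends -----
After opening an HTML file, ':echo GuessHtmlCharset()' shows the charset it found, and ':set fileencoding?' shows the encoding Vim actually used to read the file.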
Note that the autocommand is defined with the 'nested' flag. This lets the ':e ++enc=...' reload trigger the usual autocommands, so syntax highlighting still works and the remembered cursor position is still restored.
It is recommended to use 'set encoding=utf-8' so that text in any detected encoding can be converted successfully.
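As a rough illustration, the relevant _vimrc lines might look like the following. The 'fileencodings' list here is only an assumed example and should be adjusted to the encodings you actually use. The detection function reloads a file only when its current 'fileencoding' is one of the entries in this list.
code begins -----
" Use UTF-8 internally so that any legacy encoding can be converted
set encoding=utf-8
" Assumed candidate list; the 8-bit fallback (latin1) comes last
set fileencodings=ucs-bom,utf-8,latin1
code ends -----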
Comments
Remember the final 'endif'...
wolcendo--AT--friko2.onet.pl , December 21, 2005 3:32