Latest revision as of 06:07, 13 July 2012
created 2005 · complexity advanced · author Wu Yongwei · version 6.0
If you need to edit files in several different legacy encodings, Vim's 'fileencodings' option cannot help much, because such encodings often cannot be told apart automatically. Some hacks can be used to record the encoding inside the file itself (see VimTip911). HTML files, however, often carry the encoding information already, especially non-Latin1 Web pages. For example:
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" >
The following code can be put in vimrc to detect and use such an encoding specification:
<pre>
if has('autocmd')
  function! ConvertHtmlEncoding(encoding)
    if a:encoding ==? 'gb2312'
      return 'cp936'            " GB2312 imprecisely means CP936 in HTML
    elseif a:encoding ==? 'iso-8859-1'
      return 'latin1'           " The canonical encoding name in Vim
    elseif a:encoding ==? 'utf8'
      return 'utf-8'
    " Other encoding aliases should follow here
    else
      return a:encoding
    endif
  endfunction

  function! DetectHtmlEncoding()
    if &filetype != 'html'
      return
    endif
    normal m`
    normal gg
    if search('\c<meta[ \t\n]\+http-equiv=\("\?\)Content-Type\1[ \t\n]\+content="text/html;[ \t\n]*charset=[-A-Za-z0-9_]\+"[ \t\n]*>') != 0
      let reg_bak=@"
      normal y$
      let charset=matchstr(@", 'text/html; charset=\zs[-A-Za-z0-9_]\+')
      let charset=ConvertHtmlEncoding(charset)
      normal ``
      let @"=reg_bak
      if &fileencodings == ''
        let auto_encodings=',' . &encoding . ','
      else
        let auto_encodings=',' . &fileencodings . ','
      endif
      if charset !=? &fileencoding &&
            \auto_encodings =~ ',' . &fileencoding . ','
        silent! exec 'e ++enc=' . charset
      endif
    else
      normal ``
    endif
  endfunction

  " Detect charset encoding in an HTML file
  au BufReadPost *.htm* nested call DetectHtmlEncoding()
endif
</pre>
Note that the autocommand is declared nested so that reloading the file with :e ++enc=... triggers the usual autocommands again. This keeps syntax highlighting working and preserves the remembered cursor position.
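The alias mapping in ConvertHtmlEncoding() could also be kept in a lookup table, which makes it easier to add further aliases. The following is only a sketch: the names g:html_charset_aliases and ConvertHtmlEncodingByTable are hypothetical, and it assumes Vim 7 or later for dictionary support (the tip itself targets version 6.0).

```vim
" Hypothetical alternative to the if/elseif chain: a table of aliases.
" Keys are lower-case HTML charset names, values are Vim encoding names.
let g:html_charset_aliases = {
      \ 'gb2312': 'cp936',
      \ 'iso-8859-1': 'latin1',
      \ 'utf8': 'utf-8',
      \ }

function! ConvertHtmlEncodingByTable(encoding)
  " Look the name up case-insensitively; names that are not in the
  " table are returned unchanged, as in ConvertHtmlEncoding().
  return get(g:html_charset_aliases, tolower(a:encoding), a:encoding)
endfunction
```

With this table, :echo ConvertHtmlEncodingByTable('GB2312') should print cp936, and a name that is not listed is passed through as-is.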
It is recommended to use <code>set encoding=utf-8</code> in order to ensure successful encoding conversion.
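As an illustration, the relevant vimrc settings might look like the fragment below. The 'fileencodings' list is only an example detection order, not part of the original tip, and should be adjusted to the encodings you actually use. DetectHtmlEncoding() reloads a file only when its current 'fileencoding' appears in this list (or equals 'encoding' when the list is empty), that is, when the encoding looks automatically chosen.

```vim
" Sketch of vimrc settings; place them before the autocommand above.
if has('multi_byte')
  " Use UTF-8 internally so that text from any legacy encoding can be
  " converted without loss.
  set encoding=utf-8
  " Example detection order only (an assumption, adjust as needed).
  set fileencodings=ucs-bom,utf-8,cp936,latin1
endif
```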
==Plugins==
*AutoFenc.vim (script 2721)
*charset.vim (script 199)
*FencView.vim (script 1708)
==Comments==
Generated pages often put the attributes in the opposite order:
<meta content="text/html …" http-equiv="Content-Type" >
This form is not recognised by the search pattern in DetectHtmlEncoding(), which expects http-equiv to come before content.
It would be reasonable to limit the search to the document head, expressed as an absolute number of characters to scan. With this restriction, pages with a lot of comments and white space in the head would be left alone, but I do not think this is much of a problem.
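Both suggestions could be combined roughly as follows. This is a sketch, not the tip author's code: FindHtmlCharset is a hypothetical name, the limit is expressed as a line count (50, an arbitrary choice) rather than a character count, and the stopline argument of search() assumes Vim 7 or later. The pattern is looser than the original, since it accepts charset= anywhere inside a meta tag.

```vim
" Hypothetical helper: return the charset named in a <meta> tag near the
" top of the buffer, or '' if none is found.
function! FindHtmlCharset()
  let save_cursor = getpos('.')
  call cursor(1, 1)
  " Match 'charset=' inside any <meta ...> tag, whatever the attribute
  " order. Flags: c = accept a match at the cursor, n = do not move the
  " cursor, W = do not wrap around. The last argument stops at line 50.
  let lnum = search('\c<meta\_[^>]*charset=', 'cnW', 50)
  call setpos('.', save_cursor)
  if lnum == 0
    return ''
  endif
  " Join a few lines in case the tag is wrapped, then extract the value.
  let text = join(getline(lnum, lnum + 3), ' ')
  return matchstr(text, '\c<meta[^>]*charset=["'']\?\zs[-A-Za-z0-9_]\+')
endfunction
```

The result would still need to go through the alias conversion, for example let charset = ConvertHtmlEncoding(FindHtmlCharset()), before being used with :e ++enc=.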
Version vim7.3_v7 or higher of the :TOhtml plugin distributed with Vim includes an autoload function you could call that does a much more complete HTML-charset to Vim encoding conversion. --Fritzophrenic 16:30, November 15, 2010 (UTC)
- This is now done in the AutoFenc.vim plugin mentioned above. For an example, see the plugin code. --Fritzophrenic 22:24, April 4, 2011 (UTC)