Vim can detect the encoding of a file (for example, utf-8, utf-16le, or latin1). This tip shows an alias that invokes Vim with suitable arguments to report the encoding of a specified file.
Bash shell alias
Here is a simple alias for the Bash shell to display what Vim thinks is a file's encoding:
alias vimenc='vim -c '\''let $enc = &fileencoding | execute "!echo Encoding: $enc" | q'\'''
This saves having to start Vim, open the file, check the file encoding, and exit. The alias requires a file name as its argument.
$ vimenc UTF-16.xml
Encoding: utf-16le
Press ENTER or type command to continue
$ vimenc ISO-8859-1.xml
Encoding: latin1
Press ENTER or type command to continue
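A shell function can do the same job without the nested quoting that the alias needs. The sketch below is one possible variant, not part of the original tip: the name vimenc_files is hypothetical, and it simply runs the same Vim command once for each file after checking that at least one argument was supplied.

```shell
# Hypothetical helper (not part of the original tip): report what Vim
# thinks the encoding is for each file named on the command line.
vimenc_files() {
    local f
    if [ "$#" -eq 0 ]; then
        echo "usage: vimenc_files FILE..." >&2
        return 2
    fi
    for f in "$@"; do
        # Same Vim command as the alias; inside a function, plain single
        # quotes are enough. "--" stops Vim from treating a file name
        # that starts with "-" as an option.
        vim -c 'let $enc = &fileencoding | execute "!echo Encoding: $enc" | q' -- "$f"
    done
}
```

As with the alias, Vim still shows the "Press ENTER" prompt after each file.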
When an existing file is read, Vim tries to interpret the bytes in the file as characters using each encoding specified in the 'fileencodings' option. The first encoding that produces no conversion error is used, and that encoding is reported as the file encoding by the alias shown above.
After using the global 'fileencodings' option to determine the file encoding, Vim stores the result in the buffer-local option 'fileencoding' (the first option is plural, ending with an 's'; the second option is singular). In the alias, the let $enc = &fileencoding statement assigns the value of 'fileencoding' to an environment variable named enc. The '$' prefix tells Vim to set an environment variable, which the shell's echo command then displays.
- :help expr-option
- :help :let-environment
- :help :!cmd
When the 'encoding' option is set to a Unicode value such as utf-8, the default for 'fileencodings' is "ucs-bom,utf-8,default,latin1" which will check, in order:
- Presence of a Unicode BOM.
- UTF-8 (used if the whole file converts without error).
- System locale's default character set.
- Latin1 (which will always work).
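The fallback order for "ucs-bom,utf-8,default,latin1" can be imitated outside Vim to see how it behaves. The following is a rough sketch under stated assumptions, not Vim's actual implementation: it uses iconv as a stand-in for Vim's conversion check, the function name guess_encoding is hypothetical, it recognizes only the UTF-8 and UTF-16 byte order marks, and it skips the locale-dependent "default" step.

```shell
# Hypothetical helper: a rough imitation of the 'fileencodings' fallback.
# This is NOT how Vim is implemented; iconv stands in for Vim's own
# conversion check, and the "default" (locale) step is left out.
guess_encoding() {
    local file=$1 bom
    # Step 1 (ucs-bom): read the first three bytes as hex and look for
    # a byte order mark. UTF-32 marks are not handled in this sketch.
    bom=$(head -c 3 < "$file" | od -An -tx1 | tr -d ' \n')
    case $bom in
        efbbbf) echo utf-8;    return ;;
        fffe*)  echo utf-16le; return ;;
        feff*)  echo utf-16;   return ;;
    esac
    # Step 2 (utf-8): accept UTF-8 only if the whole file is valid UTF-8.
    if iconv -f UTF-8 -t UTF-8 < "$file" >/dev/null 2>&1; then
        echo utf-8
        return
    fi
    # Step 3 (latin1): every byte is a valid Latin1 character, so this
    # last candidate can never fail.
    echo latin1
}
```

Note that a file containing only ASCII characters is reported as utf-8 by this sketch, because it converts without error at the UTF-8 step and the later candidates are never tried.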
See also
- Working with Unicode
- Detect encoding from the charset specified in HTML files
- Forcing UTF-8 Vim to read Latin1 as Latin1