File format
From Vim Tips Wiki
Tip 1585 · created February 16, 2008 · complexity basic · author Metacosm · version 7.0
See Change end-of-line format for other suggestions. May merge these tips later.
The 'fileformat' option
'fileformat' ('ff' for short) controls how Vim handles line-ending sequences. Vim recognizes three file formats: unix, dos, and mac. They differ only in the line-ending sequence used on disk. Unix uses LF (line feed; ^J or 0x0A), Mac uses CR (carriage return; ^M or 0x0D), and DOS uses CRLF (^M^J or 0x0D 0x0A). Here "Mac" means classic Mac OS; Mac OS X and later use LF, the same as Unix.
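To make the byte sequences concrete, here is a small Python sketch (not Vim code) that writes the same two lines in each format. The EOL table and the encode_lines helper are names invented for this illustration.

```python
# Line terminators for the three formats Vim calls unix, dos, and mac.
EOL = {
    "unix": b"\n",    # LF, 0x0A
    "dos": b"\r\n",   # CR LF, 0x0D 0x0A
    "mac": b"\r",     # CR, 0x0D
}


def encode_lines(lines, fmt):
    """Return the on-disk bytes for `lines`, ending each line with the terminator of `fmt`."""
    return b"".join(line.encode("ascii") + EOL[fmt] for line in lines)


if __name__ == "__main__":
    # Print a hex dump of the same two lines in each format.
    for fmt in ("unix", "dos", "mac"):
        print(fmt, encode_lines(["Line 1", "Line 2"], fmt).hex(" "))
```

The text of the lines is identical in all three outputs; only the terminator bytes change.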
Format conversion
Assuming that the fileformat has been properly detected (see below), it is very simple to convert to a different format. Simply set 'ff' to the desired format name; for example, if you are editing a dos file and want to convert to unix, just do :set ff=unix. That's it. The next time the file is written, unix line ends will be used. Things are a little tougher when your file has different ending sequences on different lines; this situation is discussed below.
Mass conversion
You can do mass conversion from the shell like so:
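vim +"argdo set ff=<format> | update" +qa <files>
This sets 'fileformat' for each file in the argument list, writes each file that changed with :update, then quits. Writing before moving on matters because :argdo stops if it has to leave a modified buffer (unless 'hidden' or 'autowrite' is set).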
See :help :argdo and :help -c if you need more detail on what this does and why.
ff Detection -- 'fileformats'
When reading a file, Vim tries to detect the proper fileformat. How it does this is controlled by the 'fileformats' ('ffs') option, which contains a comma-separated list of the formats to try. Each platform has its own default value for 'ffs':
- unix - ffs=unix,dos
- dos / windows - ffs=dos,unix
- mac - ffs=mac,unix,dos
Basically, Vim tries to find a line-ending sequence that appears at the end of every line, and then uses that for 'ff'. This process can be fooled in two ways:
- The true fileformat is not listed in 'fileformats', e.g. a Mac file on a Unix platform using the defaults.
- There is a mix of line-ending sequences. This can happen when mixing tools made for different platforms.
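The detection order can be sketched in Python. This is a simplified model based on the rules listed under :help 'fileformats', not Vim's actual implementation: it omits the special case that prefers mac when CRs outnumber NLs near the top of the file, and the function name detect_fileformat is invented for this illustration.

```python
def detect_fileformat(data, fileformats):
    """Guess 'unix', 'dos', or 'mac' for raw file bytes.

    `fileformats` is the list of names from 'ffs', in order.
    Simplified: the "more CRs than NLs" exception for mac is not modelled.
    """
    has_nl = b"\n" in data
    # Every NL is preceded by a CR if no NL is left once all CR NL pairs are removed.
    all_crnl = has_nl and b"\n" not in data.replace(b"\r\n", b"")
    if all_crnl and "dos" in fileformats:
        return "dos"
    if has_nl and "unix" in fileformats:
        return "unix"
    if b"\r" in data and "mac" in fileformats:
        return "mac"
    # Nothing matched: fall back to the first name in the list.
    return fileformats[0]


if __name__ == "__main__":
    # A Mac file read with the Unix defaults falls through to "unix".
    print(detect_fileformat(b"Line 1\rLine 2\r", ["unix", "dos"]))
    # A file with mixed CR LF and LF endings is treated as "unix".
    print(detect_fileformat(b"Line 1\r\nLine 2\n", ["unix", "dos"]))
```

Both calls in the example correspond to the two failure cases listed above: in each the result is unix, so the stray CRs remain in the buffer as ^M.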
When detection goes awry
When 'ff' is not detected correctly, you will usually see part or all of the line ending sequences in the editor window. Here is a short illustration of what you see when a mismatch happens.
File contents:
Line 1
Line 2
unix line-ends, ff=mac:
Line 1^JLine 2^J
unix line-ends, ff=dos:
Line 1
Line 2
dos line-ends, ff=unix:
Line 1^M
Line 2^M
dos line-ends, ff=mac:
Line 1
^JLine 2
^J
mac line-ends, ff=unix:
Line 1^MLine 2^M
mac line-ends, ff=dos:
Line 1^MLine 2^M
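The six cases above can be reproduced with a rough Python model of how the bytes are split into lines for a given 'ff'. This is an illustration only, under the assumption that unix and dos split on LF (dos also dropping a CR directly before the LF) and mac splits on CR; the function name displayed_lines is invented here.

```python
def displayed_lines(data, ff):
    """Roughly model the lines Vim shows for `data` read with fileformat `ff`."""
    text = data.decode("ascii")
    sep = "\r" if ff == "mac" else "\n"
    pieces = text.split(sep)
    # Text after the final terminator ("" when the file ends with one).
    last = pieces.pop()
    if ff == "dos":
        # A CR directly before the LF is part of the dos terminator.
        pieces = [p[:-1] if p.endswith("\r") else p for p in pieces]
    if last:
        pieces.append(last)
    # Leftover control characters are shown as ^M (CR) and ^J (LF).
    return [p.replace("\r", "^M").replace("\n", "^J") for p in pieces]


if __name__ == "__main__":
    dos_file = b"Line 1\r\nLine 2\r\n"
    for ff in ("unix", "dos", "mac"):
        print(ff, displayed_lines(dos_file, ff))
```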
Fixing Detection Problems
If your file has consistent line endings throughout, but you have had a 'ff' detection problem, the best fix is to force Vim to use the correct format with the :e command:
:e ++ff=mac
Fixing Inconsistent Line Endings
If you have extra leading or trailing characters (^M in unix or ^J in mac), use :%s/\r// to remove them. If you have long lines (mac line-ends with ff=dos or ff=unix, or unix line-ends with ff=mac), use :%s/\r/\r/g to replace the wrong line-ends with the correct ones. If you are curious, the reasoning behind this methodology appears below.
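For comparison, the end result of cleaning up mixed line endings can be sketched at the byte level in Python. This is a regular-expression substitution outside Vim, not the :s commands above, and normalize_eols is a name invented for this illustration. Unlike in Vim, \r and \n in Python always mean CR and LF, whatever the file's format.

```python
import re


def normalize_eols(data, target=b"\n"):
    """Rewrite every CR LF, lone CR, or lone LF in `data` as `target`."""
    # CR LF is listed first so the pair is consumed as a single terminator.
    # A function is used as the replacement so `target` is inserted verbatim.
    return re.sub(rb"\r\n|\r|\n", lambda match: target, data)


if __name__ == "__main__":
    mixed = b"Line 1\r\nLine 2\rLine 3\n"
    print(normalize_eols(mixed))             # every line now ends in LF
    print(normalize_eols(mixed, b"\r\n"))    # every line now ends in CR LF
```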
The Nitty-Gritty
:substitute can be tricky when dealing with CR and NL. The problem stems from the fact that Vim uses NL in memory to represent a Nul character (see :help NL-used-for-Nul). As a result, unexpected things tend to happen when you try to modify line endings using :s. Vim help says that \r matches CR and \n matches an end-of-line. In fact, when you use :s, \r in the search pattern will match CR or LF, depending on the fileformat! Basically, it matches whichever control character (^M or ^J) you see in the editor window. Furthermore, in the replacement string, \n expands to a LF, which then gets interpreted as... a Nul! (see :help sub-replace-special). Fortunately, when \r is used in the replacement string, it is interpreted as a line-end, which will then display (and get written) as desired.
Comments
VimTip736 implements the display of non-native fileformat in the statusline.
Excellent article. Forcing the file format (:e ++ff=mac) did the trick for me. I knew it was a mac-file, but ff=mac never seemed to change anything. I've been looking for this solution quite a while now ... Thanks
