Vim Tips Wiki
Tip 242

created 2002 · complexity intermediate · version 6.0


Vim can search for text that spans multiple lines. For example, the search /hello\_sworld finds "hello world" on a single line, and also finds "hello" at the end of one line followed by "world" at the start of the next.

This tip shows how to search over multiple lines, and presents a useful command, :S, for searching for phrases. Entering :S hello world finds "hello" followed by "world" separated by spaces, tabs or newlines. Entering :S! hello world allows any non-word characters, including newlines, between the words.

Patterns including end-of-line

The search /^abc finds abc at the beginning of a line, and /abc$ finds abc at the end of a line. However, in /abc^def and /abc$def the ^ and $ are just ordinary characters with no special meaning. By contrast, each of the following has a special meaning anywhere in a search pattern.

\n a newline character
\_s a whitespace (space or tab) or newline character
\_^ the beginning of a line (zero width)
\_$ the end of a line (zero width)
\_. any character including a newline

Example searches:

/abc\n*def
Finds abc followed by zero or more newlines then def.
Finds abcdef, or abc ending one line and def starting a later line.
Any lines in between have to be empty (no space or tab characters).
/abc\_s*def
Finds abc followed by any whitespace or newlines then def.
Finds abcdef or abc followed by blank lines and def.
The blank lines can contain any number of space or tab characters.
There may be whitespace after abc or before def.
/abc\_$\_s*def
Finds abc at end-of-line followed by any whitespace or newlines then def.
There must be no characters (other than a newline) following abc.
There can be any number of space, tab or newline characters before def.
/abc\_s*\_^def
Finds abc followed by any whitespace or newlines then def where def begins a line.
There must be no characters (other than a newline) before def.
There can be any number of space, tab or newline characters after abc.
/abc\_$def
Finds nothing because \_$ is "zero width" so the search is looking for abcdef where abc is also at end-of-line (which cannot occur).
/abc\_^def
Finds nothing because \_^ is "zero width" so the search is looking for abcdef where def is also at beginning-of-line (which cannot occur).
/abc\_.\{-}def
Finds abc followed by any characters or newlines (as few as possible) then def.
Finds abcdef or abc followed by any characters then def.
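
The example searches above can be tried without typing each one by hand. The following is a minimal sketch, not part of the original tip: it assumes a throwaway scratch buffer with invented contents and the default 'magic' setting, and uses the built-in search() function, which returns the line number of the match or 0 when nothing matches.

```vim
" Sketch: try two of the example patterns on a scratch buffer.
" The buffer contents are invented for this illustration.
new
call setline(1, ['abc', '', 'def', 'abcdef'])
call cursor(1, 1)
" Flags: 'c' accepts a match at the cursor, 'n' does not move the cursor.
echo search('abc\n*def', 'cn')
echo search('abc\_$def', 'cn')
bwipeout!
```

With this buffer the first call should report line 1 (abc, an empty line, then def), and the second should report 0 because \_$ is zero width and def can never follow it directly.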

Searching for multiline HTML comments

It is common for comments in HTML documents to span several lines:

<!-- This comment
 covers two lines. -->

The following search finds any HTML comment:

/<!--\_.\{-}-->

The atom \_. finds any character including end-of-line. The multi \{-} matches as few characters as possible, so the match stops at the first "-->". The multi * is greedy: it would carry the match on to the last "-->" in the file, swallowing everything between the first comment and the last.
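
The difference between \{-} and * can be seen by asking where each match ends. This is only a sketch under assumptions: the scratch buffer and its contents are made up, and searchpos() with the 'e' flag is used to report the end of the match as [line, column].

```vim
" Sketch: compare the lazy and greedy forms of the comment search.
" The buffer contents are invented for this illustration.
new
call setline(1, ['<!-- one -->', 'text', '<!-- two', ' lines -->'])
call cursor(1, 1)
" Flags: 'c' accepts a match at the cursor, 'e' reports the end of the
" match, 'n' does not move the cursor.
echo searchpos('<!--\_.\{-}-->', 'cen')
echo searchpos('<!--\_.*-->', 'cen')
bwipeout!
```

Both matches start at the first comment. The lazy form should end on line 1, at the close of that comment, while the greedy form should end on line 4, at the close of the last comment in the buffer.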

Syntax highlighting may not be accurate, particularly with long comments. The following command will improve the accuracy when jumping in the file, but may be slower (:help :syn-sync):

:syntax sync fromstart

Searching for words over multiple lines

The script below defines command :S that will search for a phrase, even when the words are on different lines. Examples:

:S hello world
Searches for "hello" followed by "world", separated by whitespace including newlines.
:S! hello world
Searches for "hello" followed by "world", separated by any non-word characters (whitespace, newlines, punctuation).
Finds, for example, "hello, world" and "hello+world" and "hello ... world". The two words can be on different lines.

After entering the command, press n or N to search for the next or previous occurrence.

Put the following in your vimrc (or in file searchmultiline.vim in your plugin directory):

" Search for the ... arguments separated with whitespace (if no '!'),
" or with non-word characters (if '!' added to command).
function! SearchMultiLine(bang, ...)
  if a:0 > 0
    let sep = (a:bang) ? '\_W\+' : '\_s\+'
    let @/ = join(a:000, sep)
  endif
endfunction
" :S sets the search register, then runs a normal search on it so n and N work.
command! -bang -nargs=* -complete=tag S call SearchMultiLine(<bang>0, <f-args>)|normal! /<C-R>/<CR>
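
To see which pattern the command ends up searching for, the join() call from the function can be run on its own. A small sketch with the two example words hard-coded:

```vim
" Sketch: the patterns that :S and :S! build for the words hello and world.
echo join(['hello', 'world'], '\_s\+')
echo join(['hello', 'world'], '\_W\+')
```

These print hello\_s\+world and hello\_W\+world. After running :S, the same information is available from :echo @/. Note that the words are joined exactly as typed, so pattern characters in them (for example . or *) keep their special meaning.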

Comments

Todo: Is the following text worth keeping? Does the \_[abc] stuff work?

\_s is a different kind of beast. You can insert the underscore in any of the character-class atoms to include line-ends in the class. In this case the match position moves past a line-end when it matches. This means you can search for things like \_S\+ to match any sequence of NON-whitespace characters, even across multiple lines, or \_[abc]\+ to match sequences of characters containing only the letters a, b, or c, that can span multiple lines.
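
One way to check the claim about collections such as \_[abc] is to try it on a scratch buffer. A minimal sketch, assuming invented buffer contents and using searchpos() to report the start and the end of the match as [line, column]:

```vim
" Sketch: does a collection with \_ prepended match across a line break?
" The buffer contents are invented for this illustration.
new
call setline(1, ['xx abca', 'bcab yy'])
call cursor(1, 1)
" Flags: 'c' accepts a match at the cursor, 'n' does not move the cursor,
" 'e' reports the end of the match instead of the start.
echo searchpos('\_[abc]\+', 'cn')
echo searchpos('\_[abc]\+', 'cen')
bwipeout!
```

If the collection includes the line-end, the match should start on line 1 and end on line 2. See :help /\_[] for the documented behaviour.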

I think the above is useful. I didn't "get" that the '_' was special (I thought it was a rendering bug. Sloppy reading really.) I guess at the top of this article I would "lead" with the great news (which I didn't know): You can add an underscore '_' in front of any character-class atom to add newline to the class. It even works for (say) [^a-zA-Z].

Given the above, and as a person who loves the flexibility of regular expressions, it seems more natural to me to search directly for:

hello\_[^a-zA-Z]*world

rather than use a special-purpose script.
