Wikia

Vim Tips Wiki

Remove diacritical signs from characters

Talk0
1,612pages on
this wiki
Revision as of 05:00, April 7, 2012 by JohnBot (Talk | contribs)

Tip 1675 Printable Monobook Previous Next

created August 31, 2011 · complexity basic · author Marcmontu · version 7.0


Characters such as 'á' or 'ç' (with diacritics) may be included in code comments and are successfully processed by many tools. However some tools do not work with these characters.

Instead of changing the tools, it is common to remove the diacritical signs (e.g. replace á with a, and ç with c). This tip provides a script to do the necessary substitutions in a single step.

Script

Create file ~/.vim/plugin/diacritics.vim (Unix) or $HOME/vimfiles/plugin/diacritics.vim (Windows) containing one of the scripts below, then restart Vim. Alternatively, add one of the scripts to your vimrc and restart Vim.

The following script uses Vim's tr() to translate characters with diacritics to characters without. It loads the buffer into a variable in memory and translates all characters in one operation, so it is efficient. However, there is no opportunity to review changes as they occur.

" Remove diacritical signs from characters in specified range of lines.
" Examples of characters replaced: á -> a, ç -> c, Á -> A, Ç -> C.
function! s:RemoveDiacritics(line1, line2)
  let diacs = 'áâãàçéêíóôõüú'  " lowercase diacritical signs
  let repls = 'aaaaceeiooouu'  " corresponding replacements
  let diacs .= toupper(diacs)
  let repls .= toupper(repls)
  let all = join(getline(a:line1, a:line2), "\n")
  call setline(a:line1, split(tr(all, diacs, repls), "\n"))
endfunction
command! -range=% RemoveDiacritics call s:RemoveDiacritics(<line1>, <line2>)

The following alternative script uses :s to search and replace, with an opportunity to confirm each change so it can be reviewed (or press a to proceed with all changes). The substitute uses a replacement expression (\=) to look up the translation character in a dictionary.

" Remove diacritical signs from characters in specified range of lines.
" Examples of characters replaced: á -> a, ç -> c, Á -> A, Ç -> C.
" Uses substitute so changes can be confirmed.
function! s:RemoveDiacritics(line1, line2)
  let diacs = 'áâãàçéêíóôõüú'  " lowercase diacritical signs
  let repls = 'aaaaceeiooouu'  " corresponding replacements
  let diacs .= toupper(diacs)
  let repls .= toupper(repls)
  let diaclist = split(diacs, '\zs')
  let repllist = split(repls, '\zs')
  let trans = {}
  for i in range(len(diaclist))
    let trans[diaclist[i]] = repllist[i]
  endfor
  execute a:line1.','.a:line2 . 's/['.diacs.']/\=trans[submatch(0)]/gIce'
endfunction
command! -range=% RemoveDiacritics call s:RemoveDiacritics(<line1>, <line2>)

Each alternative script above defines the :RemoveDiacritics command, and the command accepts a range which defaults to the whole buffer. Some examples follow (type the first couple of letters of the command then press Tab for command completion, or press the up arrow for command history):

:RemoveDiacritics               " whole buffer
:.RemoveDiacritics              " current line
:'<,'>RemoveDiacritics          " last selected range of lines

See ranges for more information.

References

Comments

Around Wikia's network

Random Wiki