Tuesday, February 5, 2019

Removing Duplicates from Multiple Columns in LibreOffice Calc (OpenOffice)

So I was trying to remove duplicates from a .csv in LibreOffice. I could do it no problem for a single column, following instructions referenced below, but not with multiple columns. Here's how I worked around the problem to get the result I needed. TL;DR -- you don't, you import the columns as one column and remove duplicates on the single column, which works just fine.

Instructions

  1. If you are using a .csv, you are good to go, skip to step 3.
  2. If not, save as a .csv or copy to a new file and save as .csv (if you have a workbook -- so you don't lose the 2nd/3rd worksheet information).
  3. Close the .csv and then open it again -- when prompted to import the file, select nothing under 'Separator Options' (defaults are comma, tab, and semicolon), so the content is loaded into one column. Note you may have to right-click and open with > LibreOffice Calc if that isn't your default .csv editor. You can open Calc then use File > Open if you'd prefer.
  4. *These instructions are available most anywhere and work at this point* Select column A > Data > More Filters > Standard Filter.
  5. Set 'Column A' = Not Empty.
  6. Expand the options and select 'No Duplicates' and select the 'Copy results to:' box and copy the results to B1.
  7. Delete Column A (right-click > Delete Columns) then save as new .csv (just in case) or copy the resulting content wherever you need it.

References

https://ask.libreoffice.org/en/question/127060/unique-cell-or-delete-the-lines-with-all-the-duplicate/ https://www.wikihow.com/Remove-Duplicates-in-Open-Office-Calc