2017-04-18

Package to read and write Excel workbook in R: openxlsx or xlsx?

Recently I tried both packages, xlsx and openxlsx in R to read and write Excel workbooks (in xlsx format). Here I list a few things that I find useful when choosing which one to use based on the aim and context. These are based on xlsx 0.5.7 and openxlsx 4.0.0.

1. For some reason, xlsx uses more memory and runs slower than openxlsx, perhaps due to the use of Java runtime by xlsx. When dealing with data tables that are bigger than a few MB on a computer with very limited RAM, it's better to use openxlsx.

2. When writing contents with languages other than English, sometimes the special letters in UTF are not properly encoded when they are retrieved from database. In this case, the package xlsx usually will make everything right without your attention when writing to workbook. With openxlsx, you have to parse and get the UTF contents correctly into your local memory, before they are written to workbook. (This issue is unlikely to happen within an OS with any graphic UI. If it happens, the use of hex and unhex functions in the SQL queries can solve most problems.)

3. Want to add pictures (jpg or png) in workbook? Use xlsx.

4. With xlsx, column widths can be automatically set after the data frame is added. In openxlsx, they can only be manually set with explicit values.

5. To read out part of a worksheet to be then formatted as an html table, openxlsx is the better option.

6. To apply more than one cell style (like font style, background color, boarder etc.) into a few cell ranges in a worksheet, openxlsx is much easier to use and running faster with less memory than xlsx.

7. When none of the above special requirements are worried about, openxlsx is much handier to install and to use.

没有评论:

发表评论