1 Introduction
When learning R, most people focus on functions, models, and visualizations. However, many real-world problems start much earlier — at the data import stage — and end much later — with exporting results.
If data is read incorrectly, no statistical method can save the analysis.
In this post, we focus on the logic of data import and export in R, using CSV and Excel files. Rather than memorizing functions, we build a mental model for how R interacts with files.
2 Why Data Import and Export Matters
Data analysis is a workflow:
Data source → Import → Analysis → Results → Export → Sharing
Errors often occur at the import stage:
wrong delimiters,
incorrect decimal separators,
incorrect file paths,
silently converted data types.
The result?
A model that runs perfectly — on the wrong data.
3 CSV vs Excel: Not a Competition
Before touching R, we should clarify the difference between file formats.
3.1 CSV Files
Plain text files
Lightweight and fast
Universally supported
One table per file
No formatting, only data
Example:
total_bill,tip,sex
16.99,1.01,Female
3.2 Excel Files
Binary format (.xlsx)
Can contain multiple sheets
Store structure and presentation together
Widely used for reporting and sharing
Key idea:
CSV is a data transport format.
Excel is a communication format.
4 Working Directory: Where R Actually Looks
One of the most common beginner mistakes has nothing to do with R syntax.
R does not search your entire computer for files. It only looks inside its working directory.
Calling getwd() shows where R is currently looking.
If a file exists on your computer but not in this directory, R behaves as if the file does not exist.
This is why errors like:
cannot open the connection
usually indicate a path problem, not a coding problem.
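A minimal sketch of how to check for a path problem before reading. The temporary folder and the `demo_dir` and `full_path` names are assumptions made only to keep the example self-contained; in practice the folder would be wherever the file actually lives.

```r
# Create a throwaway folder with a tiny tips.csv (illustrative data only).
demo_dir <- tempfile("wd_demo")
dir.create(demo_dir)
writeLines(c("total_bill,tip,sex", "16.99,1.01,Female"),
           file.path(demo_dir, "tips.csv"))

# The directory R resolves relative paths against.
print(getwd())

# A bare file name is looked up in the working directory only,
# so this is TRUE only if tips.csv happens to be there.
exists_in_wd <- file.exists("tips.csv")

# Building the full path removes the dependency on the working directory.
full_path <- file.path(demo_dir, "tips.csv")
exists_at_path <- file.exists(full_path)
tips_demo <- read.csv(full_path)
```

Checking file.exists() first turns a vague "cannot open the connection" message into a clear yes-or-no answer about the path.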
5 The Example Dataset: tips
Throughout this post, we use a single dataset: tips.
Restaurant tipping data
Small and easy to understand
Contains numeric and categorical variables
Ideal for demonstrating import/export logic
Data source:
https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv
6 Reading CSV Files: The Core Logic
When R reads a CSV file, it needs answers to four questions:
How are columns separated?
Is the first row a header?
What is the decimal separator?
How should text be interpreted?
These answers are provided via function arguments.
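One way to answer these questions is to look at the raw text before importing anything. The sketch below writes a small, made-up semicolon-separated file to a temporary location (the file content and variable names are assumptions for illustration) and inspects it with readLines() and count.fields().

```r
# A small illustrative file: semicolon-separated, comma as decimal mark.
path <- tempfile(fileext = ".csv")
writeLines(c("total_bill;tip;sex",
             "16,99;1,01;Female",
             "10,34;1,66;Male"), path)

# Look at the first raw lines: separator, header, and decimal mark are visible.
first_lines <- readLines(path, n = 3)
print(first_lines)

# count.fields() counts the fields per line for a candidate separator.
# A consistent count across lines suggests the separator is right.
fields_comma <- count.fields(path, sep = ",")
fields_semicolon <- count.fields(path, sep = ";")
print(fields_comma)
print(fields_semicolon)
```

Here the semicolon gives the same field count on every line, while the comma does not, which hints that the commas are decimal marks rather than separators.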
7 read.table(): The Foundation
All CSV-reading functions in base R are built on read.table().
tips <- read.table(
  file = "tips.csv",
  header = TRUE,
  sep = ",",
  dec = ".",
  stringsAsFactors = FALSE
)
Understanding this function means understanding CSV import in R.
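To make the relationship concrete, the following sketch reads the same small file once with read.table() and explicit arguments, and once with read.csv(). The file content and the `via_table` and `via_csv` names are illustrative assumptions. Note that read.csv() also changes a few other defaults (quote, fill, comment.char), which do not matter for this simple file.

```r
# A tiny comma-separated file with dot decimals (illustrative data only).
path <- tempfile(fileext = ".csv")
writeLines(c("total_bill,tip,sex",
             "16.99,1.01,Female",
             "10.34,1.66,Male"), path)

# Spell out every answer explicitly.
via_table <- read.table(path, header = TRUE, sep = ",", dec = ".",
                        stringsAsFactors = FALSE)

# Let read.csv() supply the same answers as defaults.
via_csv <- read.csv(path, stringsAsFactors = FALSE)

# Both calls should produce the same data frame for this file.
same_result <- isTRUE(all.equal(via_table, via_csv))
print(same_result)
```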
8 read.csv() and Its Assumptions
read.csv() is simply a shortcut for a common case:
Columns separated by commas
Decimal separator is a dot
tips <- read.csv("tips.csv")
This works perfectly, if the assumptions match the file.
The dangerous part? R may not throw an error even if the assumptions are wrong.
The most dangerous errors are silent ones.
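A small sketch of such a silent failure, using a made-up semicolon-separated file written to a temporary path (the data and the `wrong` and `right` names are assumptions for illustration). read.csv() runs without complaint, but the result is not the table we wanted.

```r
# Semicolon-separated file with dot decimals (illustrative data only).
path <- tempfile(fileext = ".csv")
writeLines(c("total_bill;tip;sex",
             "16.99;1.01;Female",
             "10.34;1.66;Male"), path)

# No error and no warning: every line is read as a single text column.
wrong <- read.csv(path)
str(wrong)

# Stating the separator explicitly gives the intended three columns.
right <- read.csv(path, sep = ";")
str(right)
```

The only visible symptom is in the structure of the result, which is why inspecting it right after import matters.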
9 read.csv2() and Regional Differences
In many European datasets:
Columns are separated by semicolons
Decimals use commas
total_bill;tip;sex
16,99;1,01;Female
read.csv2() is designed for exactly this structure.
tips2 <- read.csv2("tips_semicolon.csv")
Important nuance:
Even if decimals use dots, read.csv2() may still work in some cases, but this is not guaranteed.
Correct approach:
Always inspect the file structure before choosing the function.
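The sketch below shows what is at stake when the decimal separator is wrong. It uses a made-up European-style file in a temporary location (data and variable names are illustrative assumptions) and compares read.csv2() with a call that gets the separator right but the decimal mark wrong.

```r
# Semicolon-separated file with comma decimals (illustrative data only).
path <- tempfile(fileext = ".csv")
writeLines(c("total_bill;tip;sex",
             "16,99;1,01;Female",
             "10,34;1,66;Male"), path)

# read.csv2() assumes sep = ";" and dec = ",", which matches this file.
eu <- read.csv2(path)
str(eu)

# Correct separator, wrong decimal mark: no error, but the numbers
# are no longer numeric because "16,99" cannot be parsed with dec = ".".
wrong_dec <- read.csv(path, sep = ";")
str(wrong_dec)
```

In the second result a call such as mean(wrong_dec$tip) would not give a meaningful answer, even though the import itself appeared to succeed.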
10 Writing CSV Files from R
Data analysis rarely ends in R. Results are shared as files.
10.1 Writing comma-separated CSV
write.csv(tips, "tips_comma.csv", row.names = FALSE)
10.2 Writing semicolon-separated CSV
write.csv2(tips, "tips_semicolon.csv", row.names = FALSE)
Choosing the correct format depends on who will read the file next.
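A quick way to confirm that an export is readable is a round trip: write the file, read it back with the matching function, and compare. The sketch below does this with a small, made-up data frame named `tips_small` (an assumption for illustration, standing in for the full dataset) and temporary file paths.

```r
# A two-row stand-in for the tips data (illustrative values only).
tips_small <- data.frame(
  total_bill = c(16.99, 10.34),
  tip = c(1.01, 1.66),
  sex = c("Female", "Male"),
  stringsAsFactors = FALSE
)

comma_path <- tempfile(fileext = ".csv")
semi_path <- tempfile(fileext = ".csv")
write.csv(tips_small, comma_path, row.names = FALSE)
write.csv2(tips_small, semi_path, row.names = FALSE)

# Each file must be read with the function that matches how it was written.
back_comma <- read.csv(comma_path, stringsAsFactors = FALSE)
back_semi <- read.csv2(semi_path, stringsAsFactors = FALSE)
print(all.equal(tips_small, back_comma))
print(all.equal(tips_small, back_semi))

# Without row.names = FALSE, the row names are written as an extra
# first column, which comes back as a column named "X".
rn_path <- tempfile(fileext = ".csv")
write.csv(tips_small, rn_path)
first_name <- names(read.csv(rn_path))[1]
print(first_name)
```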
11 Why We Still Need Excel
CSV is technically superior in many ways. Yet Excel remains dominant in practice.
Why?
Multiple tables in one file
Familiar interface for non-technical users
Common reporting format
Excel is not an analysis tool — but it is a powerful delivery tool.
12 Working with Excel in R: openxlsx
The openxlsx package allows Excel operations without requiring Excel itself.
library(openxlsx)
12.1 Writing a simple Excel file
write.xlsx(tips, "tips.xlsx", sheetName = "tips")
12.2 Reading from Excel
tips_excel <- read.xlsx("tips.xlsx", sheet = 1)
13 Multiple Sheets: A Mini Report
Excel shines when organizing related tables.
summary_tips <- aggregate(tip ~ day, data = tips, mean)
wb <- createWorkbook()
addWorksheet(wb, "Raw Data")
writeData(wb, "Raw Data", tips)
addWorksheet(wb, "Summary")
writeData(wb, "Summary", summary_tips)
saveWorkbook(wb, "tips_report.xlsx", overwrite = TRUE)
One file.
Multiple views.
Clean structure.
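When a workbook has several sheets, it helps to list them and read by name rather than by position. The sketch below assumes the openxlsx package is installed and skips itself otherwise. The small data frame, the temporary path, and the variable names are illustrative assumptions. It also uses a named list with write.xlsx() as a shorter alternative to building the workbook step by step.

```r
if (requireNamespace("openxlsx", quietly = TRUE)) {
  library(openxlsx)

  # A small stand-in for the tips data (illustrative values only).
  tips_small <- data.frame(tip = c(1.01, 1.66, 3.50),
                           day = c("Sun", "Sun", "Sat"))
  summary_small <- aggregate(tip ~ day, data = tips_small, FUN = mean)

  # A named list becomes one sheet per element, named after the list names.
  report_path <- tempfile(fileext = ".xlsx")
  write.xlsx(list("Raw Data" = tips_small, "Summary" = summary_small),
             report_path)

  # List the sheets first, then read the one you actually want by name.
  sheet_names <- getSheetNames(report_path)
  print(sheet_names)
  summary_back <- read.xlsx(report_path, sheet = "Summary")
  print(summary_back)
}
```

Reading by sheet name keeps the code correct even if someone later reorders the sheets in the file.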
14 Common Mistakes to Watch For
Most errors are not caused by R, but by assumptions:
Incorrect working directory
Wrong delimiter (sep)
Wrong decimal separator (dec)
Reading the wrong Excel sheet
Overwriting files unintentionally
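Unintentional overwrites can be guarded against with a small check before writing. `safe_write_csv()` below is a hypothetical helper written for this post, not a function from base R or any package. It refuses to write when the target file already exists unless overwriting is requested explicitly.

```r
# Hypothetical helper: write a CSV only if the target does not exist yet,
# unless the caller explicitly allows overwriting.
safe_write_csv <- function(df, path, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    stop("File already exists: ", path, call. = FALSE)
  }
  write.csv(df, path, row.names = FALSE)
  invisible(path)
}

out_path <- tempfile(fileext = ".csv")

# First write succeeds because the file does not exist yet.
safe_write_csv(data.frame(a = 1), out_path)

# Second write is refused; the error is captured here so the script continues.
second_attempt <- tryCatch(
  safe_write_csv(data.frame(a = 2), out_path),
  error = function(e) e
)
print(conditionMessage(second_attempt))

# The original content is still intact.
kept <- read.csv(out_path)
```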
A healthy habit after every import:
head(tips)
str(tips)
summary(tips)
15 Final Thoughts
If you can:
read data correctly,
write data consciously,
choose file formats intentionally,
you have already crossed one of the most important thresholds in data analysis.
For a complementary discussion, you may also find this article useful:
https://medium.com/p/e730f4a84b3b
Extended version on Medium:
https://medium.com/@Fatih.Tuzen/understanding-data-import-and-export-in-r-working-with-csv-and-excel-files-6322e61049b2