Why Does Excel Open My CSV File With Strange Characters? (How to Fix Encoding Issues in 2026)

You exported a CSV from your CRM, opened it in Excel, and now Müller looks like MÃ¼ller. Café shows up as CafÃ©. Every row is suddenly an alien language. If this has happened to you (and it has happened to almost everyone who has ever touched a CSV), the culprit is encoding, not corruption.

This guide explains exactly why Excel mangles CSV files, how to diagnose the encoding mismatch in under a minute, and three reliable ways to open your CSV correctly without losing data.

The Quick Fix (For People Who Just Want Their Data Back)

Before the explanation, here is the fix that works in nine out of ten cases:

  1. Open Excel with a blank workbook. Do not double-click the CSV.

  2. Go to the Data tab and click From Text/CSV.

  3. Select your file. In the import dialog, change File Origin to 65001: Unicode (UTF-8).

  4. Click Load.

Your special characters will appear correctly. If that did not work, the file may be UTF-16 or Windows-1252 encoded, and the explanation below will help you figure out which.

What Is Actually Happening Inside Your CSV File

A CSV file is plain text. Plain text needs an encoding, which is the rulebook the computer uses to translate raw bytes back into letters you can read. The two encodings that cause 99% of the trouble are:

  • UTF-8: the modern standard used by almost every web app, database, and modern operating system.

  • Windows-1252 (sometimes called ANSI in Excel): a legacy single-byte encoding still used by older Excel versions on Windows when no encoding hint is present.

When you double-click a CSV on Windows, Excel guesses. It usually guesses Windows-1252. If the file was saved as UTF-8, every accented character, emoji, currency symbol, or non-Latin letter gets misread as two or three Latin characters.

That is why Müller becomes MÃ¼ller. The two-byte UTF-8 sequence for ü (0xC3 0xBC) is being read as two separate Windows-1252 characters (Ã and ¼).
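You can reproduce the mangling in two lines of Python: encode a string as UTF-8, then decode the bytes as Windows-1252, which is exactly the misread Excel performs on a double-click.

```python
text = 'Müller'
raw = text.encode('utf-8')       # b'M\xc3\xbcller': ü becomes two bytes
mangled = raw.decode('cp1252')   # cp1252 is Python's name for Windows-1252
print(mangled)                   # 'MÃ¼ller': each byte read as its own character
```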

How to Tell Which Encoding Your File Uses

You have three quick ways to check.

Method 1: Use a text editor that shows encoding

Open the file in Notepad++, VS Code, or Sublime Text. The status bar shows the detected encoding. Notepad++ displays it in the bottom-right corner. VS Code shows it in the bottom status bar.

Method 2: Check for a BOM

A BOM (byte order mark) is an invisible three-byte sequence at the start of UTF-8 files: EF BB BF. Excel respects the BOM and will open BOM-prefixed UTF-8 files correctly with a double-click. Files exported from many web apps, including Google Sheets, Salesforce, and Stripe, often skip the BOM, which is why they break.
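If you do not want to open a hex editor, you can check for the BOM by reading the first three bytes yourself. A minimal sketch:

```python
def has_utf8_bom(path):
    # The UTF-8 BOM is the fixed byte sequence EF BB BF at the start of the file
    with open(path, 'rb') as f:
        return f.read(3) == b'\xef\xbb\xbf'
```

A file that passes this check will open correctly in Excel with a plain double-click.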

Method 3: Read the symptoms

  • You see MÃ¼ller, CafÃ©, or â€™ in place of accents and curly quotes: the file is UTF-8 being read as Windows-1252.

  • You see strange Asian glyphs in a Latin-script file: the file is UTF-16 being read as something else.

  • You see ?? in place of accents: the encoding was lost during a previous save, and the original characters cannot be recovered from this file.
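The same symptoms can be checked in code. The sketch below is a rough heuristic, not a full detector: it relies on the fact that UTF-8 is strict, so bytes that decode cleanly as UTF-8 almost certainly are UTF-8, while Windows-1252 accepts nearly anything.

```python
def guess_encoding(path):
    """Rough guess: check for BOMs first, then try a strict UTF-8 decode."""
    with open(path, 'rb') as f:
        raw = f.read()
    if raw.startswith(b'\xef\xbb\xbf'):
        return 'utf-8-sig'                 # BOM present: Excel opens it correctly
    if raw[:2] in (b'\xff\xfe', b'\xfe\xff'):
        return 'utf-16'                    # UTF-16 little/big-endian BOM
    try:
        raw.decode('utf-8')
        return 'utf-8'                     # valid UTF-8, no BOM
    except UnicodeDecodeError:
        return 'cp1252'                    # legacy fallback; verify by eye
```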

The Three Reliable Ways to Open a CSV Correctly

Option 1: Power Query import (most reliable)

This is the method from the Quick Fix above. It works on Excel for Microsoft 365, Excel 2019, and Excel 2021. It lets you preview the data with the right encoding before loading.

The advantage is repeatability. Once Power Query imports the file, you can refresh it later and the encoding setting is remembered.

Option 2: Save the CSV with a BOM

If you control the export, add a BOM. In Python:

```python
import csv

# utf-8-sig writes the three-byte BOM before the data
with open('export.csv', 'w', encoding='utf-8-sig', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(data)
```

The utf-8-sig codec writes the BOM. Excel will then open the file correctly on a double-click.

Option 3: Convert the file first

If the export is out of your hands and you need a one-shot fix, convert the file before opening:

  • Notepad++: open the file, go to Encoding > Convert to UTF-8-BOM, save.

  • VS Code: click the encoding indicator in the status bar, choose Save with Encoding > UTF-8 with BOM.

  • iconv on Mac or Linux: if the file is Windows-1252, convert it with iconv -f WINDOWS-1252 -t UTF-8 original.csv > fixed.csv; if it is already UTF-8, skip iconv. Then prepend a BOM with printf '\xEF\xBB\xBF' | cat - fixed.csv > final.csv.
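If you would rather script the conversion, the same fix is a few lines of Python. The source encoding here is assumed to be Windows-1252; adjust it to whatever your file actually uses.

```python
def convert_to_utf8_bom(src, dst, source_encoding='cp1252'):
    # Decode with the real source encoding, re-encode as UTF-8 with a BOM
    with open(src, 'r', encoding=source_encoding, newline='') as f:
        text = f.read()
    with open(dst, 'w', encoding='utf-8-sig', newline='') as f:
        f.write(text)
```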

What About Numbers and Dates?

Encoding is only half of the trouble Excel causes with CSVs. The other half is over-eager type conversion. Excel will:

  • Strip leading zeros from product codes (00123 becomes 123).

  • Reformat long numeric IDs as scientific notation.

  • Convert anything that looks like a date into a date, including gene names like SEPT2.

The Power Query method also fixes this. In the import preview, click the column header type icon and force the column to Text. That is the only way to keep IDs and codes intact.
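If your pipeline reads the file in code rather than in Excel, Python's csv module sidesteps the coercion entirely, because it returns every field as a string:

```python
import csv
import io

sample = "sku,gene\n00123,SEPT2\n"
rows = list(csv.reader(io.StringIO(sample)))
# Every field stays a string: leading zeros and date-like names survive
print(rows[1])   # ['00123', 'SEPT2']
```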

Why This Problem Has Lasted 30 Years

CSV is the most common data exchange format in business, but it has no formal standard for encoding declaration. RFC 4180, the closest thing to a spec, does not require the encoding to be declared inside the file.

Excel cannot read minds. When it sees a file ending in .csv, it has to guess. The guess defaults to whatever the system locale prefers, which on most Windows installs is still Windows-1252. Microsoft has improved this with each Excel release, and Office 365 now opens UTF-8 files with a BOM correctly. Files without a BOM remain a coin flip.

A Simple Workflow That Avoids the Problem Entirely

If you handle CSVs regularly, build this into your routine:

  1. Always export with UTF-8 with BOM if you have the option.

  2. Never double-click a CSV from an unknown source. Use Data > From Text/CSV and check the encoding in the preview.

  3. Keep a dedicated CSV editor for inspecting files before they reach Excel. There is a roundup of the best ones in The top 5 CSV editors.

  4. For automated pipelines, pin the encoding in your code rather than letting the OS decide.
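To see what "letting the OS decide" means in practice, you can ask Python what it would use when no encoding is given. On most Windows machines this typically reports cp1252; on Mac and Linux, utf-8:

```python
import locale

# The encoding open() falls back to when you pass none: locale-dependent,
# which is why pinning encoding='utf-8' in your code matters
default = locale.getpreferredencoding(False)
print(default)
```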

Frequently Asked Questions

Why does my CSV look fine in Notepad but broken in Excel?

Notepad on Windows 11 detects UTF-8 automatically. Excel still defaults to Windows-1252 for files without a BOM. Same file, two different guesses, two different results.

Will saving as XLSX fix the encoding?

Yes, but only if Excel had it right when you saved. If the file already showed Müller, saving as XLSX freezes the corruption in place. Re-open the source CSV with the correct encoding first, then save as XLSX.

Does Google Sheets have the same problem?

Google Sheets handles UTF-8 without a BOM correctly by default. If your team is constantly fighting Excel encoding, Sheets removes that specific friction.

What is the difference between UTF-8 and UTF-8 BOM?

The data is identical. UTF-8 BOM adds a three-byte marker at the start that tells Excel the encoding. The marker is invisible in most editors and harmless in any UTF-8-aware tool.
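The relationship is easy to verify: the BOM is three extra bytes, and Python's utf-8-sig codec adds it on write and strips it on read, so either form decodes to the same text.

```python
plain = 'Café'.encode('utf-8')         # 5 bytes
with_bom = 'Café'.encode('utf-8-sig')  # same bytes with EF BB BF in front
assert with_bom == b'\xef\xbb\xbf' + plain
# Reading with utf-8-sig tolerates both forms
assert with_bom.decode('utf-8-sig') == plain.decode('utf-8-sig') == 'Café'
```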

Is there a way to make Excel default to UTF-8 forever?

On Microsoft 365 you can change the default in File > Options > Advanced > When importing CSV files use Unicode (UTF-8). The option has been available since late 2022. On older Excel versions, the answer is no, and Power Query remains the workaround.

Why does the BOM matter only sometimes?

Most modern tools (browsers, databases, scripting languages) detect UTF-8 automatically and treat the BOM as optional. Excel is the loud exception because it has to support decades-old corporate workflows that still write in Windows-1252.

The Bottom Line

Excel does not mangle CSV files out of malice. It guesses the encoding because the format does not tell it what to use. Once you understand that, the fix takes thirty seconds: import through Power Query, set UTF-8, and your data comes back exactly as it was exported. For files you produce yourself, write a BOM and the problem disappears for everyone downstream.

If you work with CSVs often, the small habit of never double-clicking them will save you hours every year.

Just looking for some example CSV or Excel files? We've got your back.