Introduction
Working with international data in Google Sheets often means encountering characters with diacritics—letters such as á, è, ñ, or ü. While these symbols are perfectly valid in many languages, they can create problems when you need to match, sort, or export data to systems that only recognise plain Latin letters. Fortunately, Google Sheets provides several techniques to strip accents and replace them with their simple English equivalents. In this article we will explore why normalising diacritics is essential, walk through native spreadsheet functions, demonstrate a powerful REGEX‑based solution, show how to automate the process with a short Apps Script, and finish with practical tips to avoid common pitfalls. By the end, you’ll have a complete toolbox for clean, accent‑free data.
Understanding Diacritics and Why They Matter
Diacritics are visual markers added to base letters to indicate a change in pronunciation, stress, or meaning. In a spreadsheet, they behave like any other character, which means:
- Text‑matching formulas (e.g., VLOOKUP or FILTER) treat “café” and “cafe” as different strings.
- Sorting algorithms place accented entries in unexpected positions, breaking alphabetical order.
- Exported CSV files may lose the correct encoding, resulting in garbled characters in downstream applications.
Standardising data by converting diacritics to their base letters eliminates these inconsistencies and improves interoperability across platforms.
Built‑in Functions for Simple Replacements
Google Sheets includes the SUBSTITUTE function, which can replace one character with another. For a handful of known accents, you can nest multiple SUBSTITUTE calls:
- =SUBSTITUTE(SUBSTITUTE(A2,”á”,”a”),”é”,”e”)
- Continue adding layers for “í”, “ó”, “ú”, “ñ”, etc.
While this method works for short lists, it quickly becomes unwieldy as the number of characters grows. The advantage, however, is that it requires no scripting and updates instantly when the source cell changes.
Leveraging REGEXREPLACE for Bulk Conversion
The REGEXREPLACE function can handle groups of characters in a single expression, dramatically reducing formula length. A common pattern groups accented vowels together:
- =REGEXREPLACE(A2,”[áàâäãå]”,”a”)
- =REGEXREPLACE(previous_result,”[éèêë]”,”e”)
- Repeat for “i”, “o”, “u”, and special cases like “ñ” → “n”.
To keep the formula readable, you can wrap it in an ARRAYFORMULA and reference a helper table that maps each accent set to its base letter. This approach scales well for large datasets and can be copied across columns without modification.
Creating a Custom Script with Apps Script
For the most flexible solution, write a short Apps Script that uses JavaScript’s String.normalize() method. The script below removes all diacritical marks in a selected range:
- function removeDiacritics() {
- var sheet = SpreadsheetApp.getActiveSheet();
- var range = sheet.getActiveRange();
- var values = range.getValues();
- for (var r = 0; r < values.length; r++) {
- for (var c = 0; c < values[r].length; c++) {
- values[r][c] = values[r][c].toString().normalize(“NFD”).replace(/[\u0300-\u036f]/g,””);
- }
- }
- range.setValues(values);
- }
After saving the script, assign it to a custom menu or a button for one‑click conversion. This method works for any Unicode character, not only the Latin‑based accents covered by formulas.
Best Practices and Common Pitfalls
When normalising diacritics, keep these guidelines in mind:
- Preserve original data—always work on a copy or use a separate column, so you can revert if needed.
- Test on a small sample before applying to an entire sheet to ensure no unintended characters are altered.
- Watch out for language‑specific letters such as “ß” (German) or “ø” (Danish); they may need custom mapping rather than simple removal.
- Combine methods—use SUBSTITUTE for a few known exceptions, REGEXREPLACE for bulk vowel handling, and Apps Script for comprehensive Unicode cleaning.
- Document the process in a hidden “Read‑Me” sheet so collaborators understand why certain columns contain stripped characters.
Conclusion
Replacing accented characters with their plain English equivalents in Google Sheets is a practical step toward cleaner, more interoperable data. By first recognising the impact of diacritics on matching, sorting, and exporting, you can choose the right tool for the job—whether it’s a quick SUBSTITUTE chain, a compact REGEXREPLACE formula, or a robust Apps Script that leverages Unicode normalisation. Applying best practices such as working on copies, testing on subsets, and documenting your approach ensures the process is safe and repeatable. Armed with these techniques, you can confidently prepare multilingual datasets for any downstream system without the headaches that hidden accents often cause.









