BUG: Fix _replace_regex mutating StringArray in-place for non-inplace replace #64752

Open
jbrockmendel wants to merge 1 commit into pandas-dev:main from jbrockmendel:bug-string-astype

jbrockmendel (Member) commented Mar 21, 2026

This unblocks #57733

Block._replace_regex calls self.astype(np.dtype(object)) when the replacement value can't be held in the current dtype. For StringArray (backed by an object ndarray), this astype chain returns a block whose values share memory with the original StringArray._ndarray, because the underlying conversion is object→object with copy=False, which is a no-op. replace_regex then mutates this shared array in place, corrupting the original.
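
The aliasing described above can be seen with plain NumPy, independent of pandas internals. The following is a minimal sketch, not the pandas code path: the variable names are illustrative, and it only demonstrates that an object→object `astype` with `copy=False` hands back the input array, so a later write is visible through the original.

```python
import numpy as np

# Stand-in for the object ndarray backing a StringArray (illustrative only).
original = np.array(["a", "b", ".", "."], dtype=object)

# Same dtype and copy=False: NumPy returns the input array itself, not a copy.
converted = original.astype(object, copy=False)
shares = np.shares_memory(original, converted)

# Writing through the "converted" array therefore changes the original too.
converted[1] = 0
leaked = original[1]
```

Here `converted is original` holds, which is why mutating the astype result without copying first corrupts the source data.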

Fix: call _maybe_copy(inplace=True) on the astype result before mutation, which copies only when the block shares refs with the original (i.e., when astype returned a view).
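
The copy-only-when-shared idea can be sketched as follows. This is an assumption-laden illustration rather than the pandas implementation: `copy_if_shared` is a hypothetical helper, and it uses `np.shares_memory` to detect aliasing, whereas the description above says pandas decides based on whether the block shares refs with the original.

```python
import numpy as np


def copy_if_shared(candidate, original):
    """Hypothetical helper: copy only when candidate aliases original."""
    if np.shares_memory(candidate, original):
        return candidate.copy()
    return candidate


original = np.array(["a", "b", ".", "."], dtype=object)

# Case 1: astype was a no-op, so a copy is needed before mutating.
view = original.astype(object, copy=False)
safe = copy_if_shared(view, original)
safe[1] = 0  # mutates the copy; original is left alone

# Case 2: astype already produced a new array, so no extra copy is made.
already_separate = original.astype(object, copy=True)
untouched = copy_if_shared(already_separate, original)
```

The point of the conditional is to avoid a redundant second copy when the conversion already allocated fresh memory.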

Reproducer:

import pandas as pd

df = pd.DataFrame({"b": list("ab..")})
df_orig = df.copy()
result = df.replace([r"\s*\.\s*", "b"], 0, regex=True)
# Before fix: df["b"]._ndarray contains ints, df != df_orig
# After fix: df is unchanged

🤖 Generated with Claude Code

… replace

closes pandas-dev#57733

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>