@Chiwendaiyue
Contributor

Problem
Fixes #62998
When DataFrame.query() is used to filter a string column by list membership, a TypeError is raised because internal methods end up with a multiline string representation of the dtypes.
Reproducible Example:

import numpy as np
from pandas import DataFrame

df = DataFrame({
    'Country': ['Abkhazia', 'Afghanistan', 'Albania', 'Algeria', 'American Samoa'],
    'GDP': [np.nan, 1.946902e+10, 1.186387e+10, 1.590490e+11, np.nan]
})
un = ['Afghanistan', 'Albania', 'Algeria']

# Previously raised: TypeError: dtype ' object\n object\ndtype: object' not understood
result = df.query(f"Country in {un}")
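For reference, the rows the query is expected to return can be cross-checked with Series.isin, which does not go through the query machinery. This is only a sketch of the intended result, assuming the filter means plain list membership on the Country column; the name `expected` is introduced here for illustration and is not part of the PR.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Country': ['Abkhazia', 'Afghanistan', 'Albania', 'Algeria', 'American Samoa'],
    'GDP': [np.nan, 1.946902e+10, 1.186387e+10, 1.590490e+11, np.nan]
})
un = ['Afghanistan', 'Albania', 'Algeria']

# Boolean-mask equivalent of the membership filter (hypothetical helper name).
expected = df[df['Country'].isin(un)]
print(expected)
```

With the fix in place, `result` from the query above should hold the same three rows as `expected`.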
Solution
Modified _get_cleaned_column_resolvers in pandas/core/generic.py to detect and handle the specific case of multiline dtype string representations before creating Series objects.

@Chiwendaiyue marked this pull request as draft November 6, 2025 10:36
@Chiwendaiyue marked this pull request as ready for review November 6, 2025 11:38
@Chiwendaiyue deleted the fix/query-issue-dtype-handling branch November 6, 2025 11:38