### Work Item / Issue Reference
> [AB#39793](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/39793)
-------------------------------------------------------------------
### Summary
1. Large result sets (100K+ rows): up to ~2× faster
2. Very large result sets (~1.2M rows): 1.4× to 1.7× faster
3. Complex joins and aggregation workloads: improved by ~40–45%
4. General query workloads: now at parity with pyodbc or better
This pull request introduces several performance and correctness
improvements to the MSSQL Python driver, focusing on efficient row
conversion, cursor behavior, and pybind object handling. The most
significant changes include caching output converters and column maps
for rows, enforcing forward-only cursor semantics, and optimizing Python
type/class lookups in the C++ extension layer.
**Row conversion and cursor improvements:**
* Added caching of column name maps and output converter maps in the
`Cursor` class, so that row and column conversions are computed only
once per result set, improving performance for large queries. This
affects `fetchone`, `fetchmany`, and `fetchall` methods, as well as
execution logic.
[[1]](diffhunk://#diff-deceea46ae01082ce8400e14fa02f4b7585afb7b5ed9885338b66494f5f38280L124-R131)
[[2]](diffhunk://#diff-deceea46ae01082ce8400e14fa02f4b7585afb7b5ed9885338b66494f5f38280L826-R848)
[[3]](diffhunk://#diff-deceea46ae01082ce8400e14fa02f4b7585afb7b5ed9885338b66494f5f38280R1160-R1167)
[[4]](diffhunk://#diff-deceea46ae01082ce8400e14fa02f4b7585afb7b5ed9885338b66494f5f38280L1961-R1994)
[[5]](diffhunk://#diff-deceea46ae01082ce8400e14fa02f4b7585afb7b5ed9885338b66494f5f38280R2038-R2046)
[[6]](diffhunk://#diff-deceea46ae01082ce8400e14fa02f4b7585afb7b5ed9885338b66494f5f38280R2080-R2088)
* Changed the cursor's scroll behavior to explicitly reject absolute
positioning for forward-only cursors, and implemented relative scrolling
using repeated fetches to match pyodbc's behavior.
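The per-result-set caching described in the first bullet above can be sketched roughly as follows. This is a minimal illustration, not the driver's `Cursor` code: `CachedRowBuilder`, its `description` and `converters` shapes, and the `build_count` counter are all hypothetical names chosen for the example.

```python
from decimal import Decimal


class CachedRowBuilder:
    """Hypothetical helper, not the driver's Cursor class."""

    def __init__(self, description, converters):
        # description: (column_name, sql_type_name) pairs for one result set.
        # converters: sql_type_name -> callable. Both shapes are assumptions.
        self._description = description
        self._converters = converters
        self._column_map = None
        self._converter_list = None
        self.build_count = 0  # how many times the maps were computed

    def _ensure_maps(self):
        # Build both maps on first use, then reuse them for every later row.
        if self._column_map is not None:
            return
        self.build_count += 1
        self._column_map = {
            name: i for i, (name, _) in enumerate(self._description)
        }
        self._converter_list = [
            self._converters.get(sql_type) for _, sql_type in self._description
        ]

    def index_of(self, name):
        self._ensure_maps()
        return self._column_map[name]

    def convert(self, raw_row):
        self._ensure_maps()
        # Apply a converter only where one exists and the value is not NULL.
        return tuple(
            conv(value) if conv is not None and value is not None else value
            for value, conv in zip(raw_row, self._converter_list)
        )


builder = CachedRowBuilder(
    [("id", "int"), ("price", "decimal")],
    {"decimal": Decimal},
)
rows = [builder.convert(raw) for raw in [(1, "9.50"), (2, None)]]
```

However many rows are converted, the maps are built once per result set, which is where the savings on large fetches would come from.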
**C++ extension (pybind) optimizations:**
* Introduced a `PythonObjectCache` namespace to cache commonly used
Python classes (datetime, date, time, decimal, uuid), reducing repeated
module imports and attribute lookups throughout the C++ codebase. All
parameter binding and result conversion logic now uses this cache.
[[1]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1R37-R98)
[[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L461-R523)
[[3]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L478-R540)
[[4]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L491-R553)
[[5]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L535-R597)
[[6]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2005-R2067)
[[7]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2083-R2146)
* Added caching of the decimal separator in the result set conversion
logic to avoid repeated system calls during data retrieval.
**Cursor semantics and configuration:**
* Updated statement execution logic in the C++ layer to always configure
cursors as forward-only, ensuring consistent behavior and compatibility
with the Python cursor implementation.
[[1]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L1422-R1488)
[[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L1559-R1625)
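On a forward-only cursor, relative scrolling reduces to fetching and discarding rows, while absolute positioning has to be rejected. The sketch below shows that behavior over an in-memory row list. `ForwardOnlyCursor` is a hypothetical stand-in, and the exception types are assumptions based on the DB-API 2.0 conventions (`NotSupportedError`, and `IndexError` for scrolling out of range), not necessarily what the driver raises.

```python
class NotSupportedError(Exception):
    """Local stand-in for the DB-API 2.0 NotSupportedError."""


class ForwardOnlyCursor:
    """Hypothetical cursor over an in-memory row list."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.rownumber = -1  # index of the last fetched row, -1 before any fetch

    def fetchone(self):
        row = next(self._rows, None)
        if row is not None:
            self.rownumber += 1
        return row

    def scroll(self, value, mode="relative"):
        if mode == "absolute":
            raise NotSupportedError(
                "absolute positioning is not supported on forward-only cursors"
            )
        if mode != "relative":
            raise ValueError(f"unknown scroll mode: {mode!r}")
        if value < 0:
            raise NotSupportedError("forward-only cursors cannot scroll backward")
        # Skip rows by fetching and discarding them one at a time.
        for _ in range(value):
            if self.fetchone() is None:
                raise IndexError("scroll past the end of the result set")


cur = ForwardOnlyCursor([(1,), (2,), (3,), (4,)])
cur.scroll(2)           # discards the first two rows
third = cur.fetchone()  # the next fetch returns the third row
```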
These changes collectively improve performance and reliability when
fetching large result sets, ensure correct cursor behavior, and make the
C++ extension more efficient in its interaction with Python objects.
---------
Co-authored-by: Gaurav Sharma <sharmag@microsoft.com>
Co-authored-by: Jahnvi Thakkar <61936179+jahnvi480@users.noreply.github.com>