Skip to content

Conversation

EduardMcfly
Copy link

Problem

PostgreSQL requires that when using DISTINCT, all columns in the ORDER BY clause must also be present in the SELECT clause. This causes issues when using Ransack with DISTINCT queries that include sorting.

Solution

This PR adds a new DistinctSortsProcessor class that:

  • Automatically detects when a query uses DISTINCT with ORDER BY
  • Adds necessary SELECT clauses for sort columns with unique aliases
  • Updates the ORDER BY clause to use the aliased column names
  • Maintains backward compatibility with existing queries

Changes

  • New file: lib/ransack/distinct_sorts_processor.rb - Core processor logic
  • Modified: lib/ransack/search.rb - Integrates the processor into the result method
  • Modified: lib/ransack.rb - Adds require for the new processor
  • New file: spec/ransack/distinct_sorts_processor_spec.rb - Comprehensive test coverage

Testing

The implementation includes extensive test coverage for:

  • Different types of sort expressions (Arel nodes, strings)
  • Complex sort strings with NULLS clauses
  • Integration with existing Ransack functionality
  • Edge cases and error conditions

Compatibility

This change is fully backward compatible and only activates when both DISTINCT and ORDER BY are present in a query.

Closes #429

- Add DistinctSortsProcessor to handle DISTINCT queries with ORDER BY clauses
- Fixes issue where PostgreSQL requires all ORDER BY columns to be in SELECT clause
- Automatically adds necessary SELECT clauses and aliases for sort columns
- Includes comprehensive test coverage for the new functionality
- Maintains backward compatibility with existing queries

Closes activerecord-hackery#429
- Detect existing SELECT expressions and reuse their aliases
- Refactor regex patterns and add helper methods
- Optimize performance by avoiding unnecessary query modifications
- Introduced a helper method to create distinct queries, improving readability.
- Updated test cases to use the new helper method for consistency.
- Refactored assertions to streamline the testing process and enhance clarity.
- Added new tests for processing sorts and extracting sort expressions.
- Improved handling of empty sorts and existing select aliases.

This refactor aims to enhance the maintainability and readability of the test suite for the DistinctSortsProcessor.
- Introduced `SqlExpressionParser` for splitting complex SQL expressions while respecting quotes and brackets.
- Refactored `process_sorts` method in `DistinctSortsProcessor` to handle string sorts and utilize the new parser.
- Improved handling of select values and sort expressions to enhance clarity and maintainability.
- Added comprehensive tests for the new SQL expression parsing functionality.

This update enhances the flexibility and robustness of sort processing in Ransack.
- Updated methods to enhance clarity and maintainability, including renaming `extract_sort_expression` to `extract_sort` and introducing a new `extract_sort_expression` method.
- Improved the `build_select_value` method to streamline the process of building select values from sorts.
- Added tests for new and existing methods to ensure robust handling of various sort expressions, including strings and Arel nodes.
- Enhanced the `matches_select_expression?` method to better match select strings with sort expressions.

This refactor aims to improve the overall functionality and maintainability of the DistinctSortsProcessor.
@scarroll32 scarroll32 requested a review from Copilot September 24, 2025 13:40
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes PostgreSQL compatibility issues with DISTINCT queries that include ORDER BY clauses by implementing a new processor that automatically adds necessary SELECT clauses and aliased columns to satisfy PostgreSQL's requirements.

  • Adds DistinctSortsProcessor to handle DISTINCT queries with ORDER BY clauses
  • Implements SqlExpressionParser utility for parsing complex SQL expressions
  • Integrates the processor into the Search#result method to automatically process compatible queries

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
lib/ransack.rb Adds requires for the new processor and utility classes
lib/ransack/search.rb Integrates processor into result method for DISTINCT queries with sorts
lib/ransack/distinct_sorts_processor.rb Core processor logic for handling DISTINCT/ORDER BY compatibility
lib/ransack/utilities/sql_expression_parser.rb Utility for parsing complex SQL expressions with proper quote/bracket handling
spec/ransack/distinct_sorts_processor_spec.rb Comprehensive test coverage for the processor
spec/ransack/utilities/sql_expression_parser_spec.rb Test coverage for the SQL expression parser utility
Comments suppressed due to low confidence (1)

spec/ransack/distinct_sorts_processor_spec.rb:1

  • String concatenation using + operator in a loop creates multiple intermediate string objects. Use string interpolation or Array#join for better performance: large_expr = \"COUNT(#{'SUM(' * 10}x#{')' * 10}, AVG(y)\"
require 'spec_helper'

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +166 to +167
new_selects = processed_sorts.filter_map { |sort_info| sort_info[:select_value] }
query.select_values += new_selects if new_selects.any?
Copy link
Preview

Copilot AI Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter_map method creates a new array and iterates through all processed sorts. Consider using each with a conditional to avoid creating intermediate arrays when the processed_sorts array is large.

Suggested change
new_selects = processed_sorts.filter_map { |sort_info| sort_info[:select_value] }
query.select_values += new_selects if new_selects.any?
processed_sorts.each do |sort_info|
select_value = sort_info[:select_value]
query.select_values << select_value if select_value
end

Copilot uses AI. Check for mistakes.


def matches_select_expression?(select_str, sort_expression)
select_str.include?(sort_expression) ||
!!select_str.match(/#{Regexp.escape(sort_expression)}\s+AS\s+(\w+)/i)
Copy link
Preview

Copilot AI Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The double negation !! is unnecessary here since the method already returns a boolean from the || operation. The match method returns a MatchData object or nil, which are both truthy/falsy values that work correctly with ||.

Suggested change
!!select_str.match(/#{Regexp.escape(sort_expression)}\s+AS\s+(\w+)/i)
select_str.match(/#{Regexp.escape(sort_expression)}\s+AS\s+(\w+)/i)

Copilot uses AI. Check for mistakes.

return nil unless match

direction = match[1].upcase
nulls = match[2]&.upcase&.strip
Copy link
Preview

Copilot AI Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safe navigation operator chain &.upcase&.strip could potentially cause issues if match[2] is an empty string. Consider using a more explicit nil check: nulls = match[2]&.then { |s| s.upcase.strip if s && !s.empty? }

Suggested change
nulls = match[2]&.upcase&.strip
nulls = match[2]&.then { |s| s.upcase.strip if s && !s.empty? }

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug when sorting by association when using postresql with distinct(:true)
2 participants